OP's point is that this isn't valid because neither of the answers is correct. If you're really trying to measure a spectrum, then the answers should allow for fuzziness. That is, you have a range/confidence interval for where green ends and where blue starts, and in between is neither/both.
Correctness is not the point. Binary choice is the whole point, because my blue may not be your blue...