Ah cool. So there is data? How consistent are humans?
What I'd really love is an actual number for a "human hallucination rate". How often will a random human
1) claim something that is wrong
2) defend the wrong claim and/or logic even when the problem is pointed out to them
(And this, of course, outside of the usual topics. In politics? I don't care. In religion? Don't care (well, maybe a bit more than politics). Let's say in physics or popular logic or something like that.)
There is evidence that children will oscillate between understanding and not understanding while learning topics. Philip Sadler at Harvard published about this, but I can't find the paper I'm thinking of on his Google Scholar page. Too many papers!
Moreover, to verify a test item you need to make sure that people will select the same answers under the same conditions at different times. People generally forget the specific questions they were asked if you ask them the same questions a month later, so being able to get them to answer the same way each time is important. It is assumed that people have some static knowledge of a topic in this scenario.
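To make the "same answers at different times" idea concrete, here is a rough sketch of how you could score it for a single item. The function names and the sample answers are made up for illustration. Cohen's kappa is one common way to correct raw agreement for chance, not necessarily what any particular study used.

```python
from collections import Counter


def percent_agreement(first, second):
    """Fraction of respondents who gave the same answer both times."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty answer lists of equal length")
    same = sum(a == b for a, b in zip(first, second))
    return same / len(first)


def cohens_kappa(first, second):
    """Agreement between two administrations, corrected for chance."""
    n = len(first)
    p_observed = percent_agreement(first, second)
    counts_first = Counter(first)
    counts_second = Counter(second)
    # Chance agreement: probability that both sittings land on the same
    # option if answers were drawn independently from each sitting's
    # observed distribution of choices.
    p_chance = sum(
        counts_first[option] * counts_second[option] for option in counts_first
    ) / (n * n)
    if p_chance == 1.0:
        # Everyone chose the same single option both times; kappa is
        # undefined here, and returning 1.0 is just a convention.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical data: four people answer one multiple-choice item twice,
# a month apart.
first_sitting = ["A", "B", "A", "C"]
second_sitting = ["A", "B", "C", "C"]

print(percent_agreement(first_sitting, second_sitting))       # 0.75
print(round(cohens_kappa(first_sitting, second_sitting), 3))  # 0.636
```

A low value here would suggest the item is not measuring stable knowledge, or that the knowledge itself is not stable.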
If you want a statistical treatment of how people answer tests, and of how we assess knowledge and other traits in people through surveys, you can read about item response theory and Rasch analysis.
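For a feel of what Rasch analysis models, here is a minimal sketch of the basic dichotomous Rasch model: the chance of a correct answer depends only on the gap between a person's ability and the item's difficulty, both on the same logit scale. The function name and the example values are made up for illustration. A real analysis estimates these parameters from response data rather than plugging them in.

```python
import math


def rasch_probability(ability, difficulty):
    """P(correct) under the dichotomous Rasch model.

    Both arguments are on the same logit scale; the probability is a
    logistic function of their difference.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# A person whose ability matches the item's difficulty has even odds.
print(rasch_probability(0.0, 0.0))            # 0.5

# One logit above the item's difficulty raises the chance of success.
print(round(rasch_probability(1.0, 0.0), 3))  # 0.731
```

Note that the model is probabilistic: even a person who "knows" the topic is expected to miss some items, which is one way to think about a baseline error rate.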