Hacker News

wasabi991011, yesterday at 2:42 AM

Hallucination rate is hallucination / (hallucination + partial + ignored), i.e. the share of non-correct responses that are hallucinations, while omniscience is correct minus hallucination.

One hypothesis is that Gemini 3 Flash refuses to answer when unsure less often than other models, but is also more likely to be correct when it is sure. This is consistent with it having the best accuracy score.
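To make the arithmetic concrete, here is a rough Python sketch of the two metrics as defined above. The counts are made up for illustration and are not benchmark results. The `Counts` class, the function names, and the division of omniscience by the total number of questions are my own assumptions, not anything taken from the benchmark itself.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical per-model tallies; field names follow the comment above."""
    correct: int
    hallucination: int  # wrong answers given instead of abstaining
    partial: int
    ignored: int        # refusals / abstentions

    @property
    def total(self) -> int:
        return self.correct + self.hallucination + self.partial + self.ignored


def hallucination_rate(c: Counts) -> float:
    # hallucination / (hallucination + partial + ignored)
    not_correct = c.hallucination + c.partial + c.ignored
    return c.hallucination / not_correct if not_correct else 0.0


def omniscience(c: Counts) -> float:
    # correct minus hallucination; dividing by the total is an assumption
    # made here so that the two models are comparable.
    return (c.correct - c.hallucination) / c.total


def accuracy(c: Counts) -> float:
    return c.correct / c.total


# Made-up numbers: "eager" rarely abstains, "cautious" abstains a lot.
eager = Counts(correct=60, hallucination=32, partial=3, ignored=5)
cautious = Counts(correct=40, hallucination=15, partial=5, ignored=40)

for name, c in [("eager", eager), ("cautious", cautious)]:
    print(name, hallucination_rate(c), omniscience(c), accuracy(c))
```

With these invented counts the eager model has the much higher hallucination rate (0.8 vs 0.25) yet still comes out ahead on both accuracy (0.6 vs 0.4) and omniscience (0.28 vs 0.25), which is the pattern the hypothesis describes.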