Hacker News

timfsu, last Tuesday at 12:22 AM

We saw this too, with Gemini specifically. My favorite example: we built a hallucination detector in Gemini (given the input, does the output make any false claims?), and after the Seahawks won the Super Bowl in February, it would consistently flag that as "not possible".
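For anyone curious what a check like that can look like, here is a minimal sketch, not the actual detector described above. It assumes the judge should compare the output only against the supplied input, and that telling it today's date helps with events after its training cutoff. The prompt wording, the SUPPORTED/UNSUPPORTED labels, and the names `build_prompt`, `makes_false_claims`, and `ask_model` are all hypothetical; `ask_model` stands in for whatever client call you use to reach the model.

```python
from typing import Callable

# Hypothetical prompt wording; the real detector's prompt is not known.
PROMPT_TEMPLATE = """You are checking an OUTPUT against an INPUT.
Today's date is {today}. Events after your training cutoff may have occurred.
Judge only whether the OUTPUT makes claims that the INPUT contradicts or does not support.
Do not use your own background knowledge to reject a claim that the INPUT supports.
Answer with exactly one word: SUPPORTED or UNSUPPORTED.

INPUT:
{source}

OUTPUT:
{output}
"""


def build_prompt(source: str, output: str, today: str) -> str:
    """Fill the template with the input text, the output to check, and the date."""
    return PROMPT_TEMPLATE.format(today=today, source=source, output=output)


def makes_false_claims(
    source: str,
    output: str,
    today: str,
    ask_model: Callable[[str], str],  # assumption: takes a prompt, returns the reply text
) -> bool:
    """Return True when the judge says the output is not supported by the input."""
    verdict = ask_model(build_prompt(source, output, today)).strip().upper()
    if verdict not in {"SUPPORTED", "UNSUPPORTED"}:
        # Fail loudly instead of guessing what a free-form reply meant.
        raise ValueError(f"unexpected verdict: {verdict!r}")
    return verdict == "UNSUPPORTED"
```

Whether this actually stops the "not possible" verdicts is something you would have to measure; a model can still fall back on stale background knowledge even when told not to.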


Replies

emodendroket, last Tuesday at 3:53 AM

I believe it was assuring me the Israelis would never invade southern Lebanon and declare a buffer zone inside it after that had already happened.