Hacker News

simianwords, yesterday at 5:27 PM

I trust you. If it happens that frequently, could you give me a single prompt that gets it to bullshit?


Replies

the_snooze, yesterday at 7:38 PM

I did this in one attempt just now: https://gemini.google.com/share/b4e016be1f69

#8 has an incorrect answer (3 appearances according to Gemini, 2 according to reality https://en.wikipedia.org/wiki/Bowl_championship_series#BCS_a...)

So it works well 95% of the time for a literally trivial use case. Imagine if any other tech tool had that kind of reliability: `ls` displays 95% of your files, your phone successfully sends and receives 95% of text messages, or Microsoft Word saves 95% of the characters you typed. That's just not acceptable.
