Braxton1980 • yesterday at 10:44 PM • 1 reply

>He wants it to be truthful

How do you know this? Why would you believe him, considering the massive lies he's told, for example about widespread fraud in the 2020 election?

Replies

kvetching • yesterday at 11:20 PM

https://artificialanalysis.ai/evaluations/omniscience?omnisc...

AA-Omniscience Hallucination Rate (lower is better) measures how often the model answers incorrectly when it should have refused or admitted to not knowing the answer. It is defined as the proportion of incorrect answers out of all non-correct responses, i.e. incorrect / (incorrect + partial answers + not attempted).
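To make the definition concrete, here is a minimal sketch of that ratio in Python. The function name, the argument names, and the example counts are hypothetical and not taken from the benchmark; returning 0.0 when there are no non-correct responses is also an assumption, since the quoted definition does not cover that case.

```python
def hallucination_rate(incorrect: int, partial: int, not_attempted: int) -> float:
    """Share of non-correct responses that were outright incorrect.

    Follows the quoted definition:
    incorrect / (incorrect + partial answers + not attempted)
    """
    non_correct = incorrect + partial + not_attempted
    if non_correct == 0:
        # Assumption: with no non-correct responses, report a rate of 0.0
        # instead of dividing by zero.
        return 0.0
    return incorrect / non_correct


# Hypothetical counts, for illustration only.
rate = hallucination_rate(incorrect=30, partial=10, not_attempted=60)
print(f"{rate:.2f}")  # 30 / (30 + 10 + 60) = 0.30
```

Note that correct answers do not appear in the formula at all, so a model can lower this rate by declining to answer more often, not only by answering correctly more often.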

Grok 4.2, which was just released in the API, just scored the best on this benchmark.
