Hacker News

dubcanada, yesterday at 8:42 PM

Grok is 17%? And that's the lowest; most models are like 80%+?

Meanwhile, hallucination is probably closer to 100% depending on the question. This benchmark makes no sense.


Replies

elAhmo, yesterday at 9:16 PM

No one serious uses grok.
