Hacker News

ehnto today at 12:47 PM

Does an LLM scoring well on the Mensa test translate to it doing excellent, factual police reporting? That is probably not true of humans who do well on the Mensa test, so why would it be true of an LLM?

We should probably verify that rigorously, since the role is itself about rigorous verification beyond reasonable doubt.

I can immediately, and reasonably, doubt the output of an LLM, pending verification.