Hacker News

andy99, yesterday at 10:55 AM

I suspect real AGI evals aren't going to be "IQ test"-like, which is how I'd categorize these benchmarks.

LLMs will probably continue to scale on such benchmarks, as they have been, without needing real ingenuity or intelligence.

Obviously I don't know the answer, but I think it's the same root problem behind why neural networks will never lead to intelligence. We're building and testing idiot savants.