>> "LLMs have reached a plateau." you should look at benchmarks such as ARC which...

singularity2001 • yesterday at 11:41 PM • 1 reply

>> "LLMs have reached a plateau."

You should look at benchmarks such as ARC, which went from "needs 10 years, currently at 0%" to almost solved within the last year. There is also a revolution happening in math that the layman might be missing.

Replies

bloppe • today at 1:56 AM

I don't care about the benchmarks. I care about how helpful coding agents are for my work. And I can barely tell the difference between the models this year and the models last year. Everyone's raving about Opus but I bet about 50% of people would be able to identify it in a blind test against Sonnet.
