Hacker News

singularity2001 yesterday at 11:41 PM

>> "LLMs have reached a plateau."

You should look at benchmarks such as ARC, which went from "needs 10 years, currently at 0%" to almost solved within the last year. Also, there is a revolution happening in math which the layman might be missing.


Replies

bloppe today at 1:56 AM

I don't care about the benchmarks. I care about how helpful coding agents are for my work. And I can barely tell the difference between the models this year and the models last year. Everyone's raving about Opus but I bet about 50% of people would be able to identify it in a blind test against Sonnet.