You should look at benchmarks such as ARC, which went from "needs 10 years, currently at 0%" to almost solved within the last year. Also, there is a revolution happening in math that the layman might be missing.
I don't care about the benchmarks. I care about how helpful coding agents are for my work. And I can barely tell the difference between this year's models and last year's. Everyone's raving about Opus, but I bet only about 50% of people would be able to identify it in a blind test against Sonnet.