logoalt Hacker News

dreis_swyesterday at 9:14 PM1 replyview on HN

I agree with your sentiment, this incremental evolution is getting difficult to feel when working with code, especially with large enterprise codebases. I would say that for the vast majority of tasks there is a much bigger gap on tooling than on foundational model capability.


Replies

qingcharlesyesterday at 10:20 PM

Also came to say the same thing. When Gemini 3 came out several people asked me "Is it better than Opus 4.1?" but I could no longer answer it. It's too hard to evaluate consistently across a range of tasks.