Hacker News

JanSt, yesterday at 6:36 PM (3 replies)

The benchmarks are very impressive. Codex and Opus 4.5 are really good coders already and they keep getting better.

No wall yet, and I think we might have already crossed the threshold of models being as good as or better than most engineers.

GDPval will be an interesting benchmark and I'll happily use the new model to test spreadsheet (and other office work) capabilities. If they keep going like this just a little bit further, many office workers will stop being useful... I don't know yet how to feel about this.

Great for humanity, probably, but for the individuals?


Replies

llmslave, yesterday at 6:42 PM

Yeah, there's no wall on this. It will be able to mimic all of human behavior given proper data.

ionwake, yesterday at 6:46 PM

It was only about 2-3 weeks ago that several HNers told me "nah, you better re-check your code" when I explained that I have over two decades of coding experience, yet have not manually edited code (in memory) for the last 6 or so months, whilst doing daily 12-hour vibe-coding sessions.

sheeshe, yesterday at 7:02 PM

OK, so why aren’t there mass layoffs ensuing right now?
