Hacker News

grandinquistor today at 3:03 PM (5 replies)

Quite a big improvement in coding benchmarks. It doesn’t seem like progress is plateauing, as some people predicted.


Replies

msavara today at 4:09 PM

Only in benchmarks. After a couple of minutes of use, it feels just as dumb as the nerfed 4.6.

cpan22 today at 5:35 PM

But it majorly regressed in long context retrieval? Which is arguably getting more and more important?

verdverm today at 3:08 PM

Some of the benchmarks went down, has that happened before?

William_BB today at 6:00 PM

Are you one of those naive people that still take these coding benchmarks seriously?

ACCount37 today at 3:06 PM

People have been "predicting" the plateau since GPT-1. By now, it would take extraordinary evidence for me to take such "predictions" seriously.