Hacker News

naasking, last Thursday at 4:17 PM

Did the models from 2 years ago produce more bugs, fewer bugs or the same bugs as today's models? Do you think next year's AI models will produce the same number of bugs, more bugs, or fewer bugs?


Replies

kentm, last Thursday at 7:55 PM

> Did the models from 2 years ago produce more bugs, fewer bugs or the same bugs as today's models?

Is anyone actually tracking that with a methodology not prone to fine-tuning? Specifically, I know a lot of the tests have the problem that you can train the AI to pass the test, so a higher score is not indicative of higher overall performance. I'm not being rhetorical here to make a point; I'm genuinely interested in whether anyone has derived a methodology that gives confidence in these claims.

(Aside: It's not a huge stretch to claim that they're getting better, but at this point the evidence mostly seems anecdotal, or comes from methods that have the problem I described above.)
