Hacker News

isodev · today at 12:20 AM (2 replies)

What we call AI at the heart of coding agents is the averaged "echo" of what people have published on the web, much of which has (often illegitimately) ended up in training data. Yes, it can probably spit out some trivial snippets, but nothing near what's needed for genuine software engineering.

Also, now that Stack Overflow is no longer a thing, good luck meaningfully improving those coding agents.


Replies

logicprog · today at 1:14 AM

Coding agents are getting most of their meaningful improvements in coding ability from RLVR (reinforcement learning with verifiable rewards) now, with priors formed by ingesting open source code and manuals directly, not Stack Overflow. RLVR doesn't rely on resources external to the AI companies at all and can be scaled up as much as they like, while the open source corpus will likely keep growing anyway, and they don't really need more of it if it doesn't. Not to mention that curated synthetic data has been shown to be very effective at training models, so they could generate their own textbooks based on open codebases, new languages, or whatever, and use that. Model collapse only happens when the training data is exclusively un-curated model output.
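
To make the "verifiable" part concrete, here's a rough sketch of the kind of reward signal people mean. The names and setup are my own, not any lab's actual pipeline, and it assumes pytest is installed; the point is that the model's code gets scored by running tests, so the loop never needs scraped Q&A at all.

    import subprocess
    import tempfile
    from pathlib import Path

    def verifiable_reward(candidate_code: str, test_code: str) -> float:
        """Score a generated solution by running its unit tests (pass = 1.0, fail = 0.0)."""
        with tempfile.TemporaryDirectory() as tmp:
            Path(tmp, "solution.py").write_text(candidate_code)
            Path(tmp, "test_solution.py").write_text(test_code)
            try:
                # Run the tests in an isolated directory; a real RL loop would sandbox this.
                result = subprocess.run(
                    ["python", "-m", "pytest", "-q", "test_solution.py"],
                    cwd=tmp, capture_output=True, timeout=60,
                )
            except subprocess.TimeoutExpired:
                return 0.0
        return 1.0 if result.returncode == 0 else 0.0

    # Example: reward for a trivially correct completion.
    reward = verifiable_reward(
        "def add(a, b):\n    return a + b\n",
        "from solution import add\n\ndef test_add():\n    assert add(2, 3) == 5\n",
    )
    print(reward)  # 1.0 if the test passes

A real pipeline would use stronger sandboxing and graded rewards rather than pass/fail, but the idea is the same: the feedback comes from executing code, not from the web.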

blackcatsec · today at 12:42 AM

Exactly this. Everything I've seen online is generally "I had a problem that could be solved in a few dozen lines of code, I asked the AI to do it for me, and it worked great!"

But what they asked the AI to do is something people have done a hundred times over on existing platform tech, and the AI will likely have little to no capability to solve the problems that come up 5-10 years from now.

The reason AI is so good at coding right now is the second dot-com tech bubble that coincided with the simultaneous release of mobile platforms and the massive expansion of cloud technology. But as the platforms from that era disappear, because it's no longer profitable to put something out there, the AI platforms will become less and less relevant.

Sure, sites like Reddit will probably still exist, where people will ask about more and more things the AI can't help with, and the AI will subsequently train on that information; but the rate at which that information appears is going to drop dramatically.

In short, at some point the AI models will be worthless, and I suspect that'll be whenever the next big "tech revolution" happens.