Despite a couple of forward-looking statements, I didn’t read this as a prediction. It seems more like a subjective/anecdotal assessment of where things are in December 2025. (Yes, with some conjecture about the implications for next year.)
Overall, it echoes my experience with Claude Opus 4.5 in particular. We’ve passed a threshold (one of several, no doubt).
Yeah nah.
People posting stuff like this are clearly not doing it; they’re reading LinkedIn posts, toying with the tech and projecting what it looks like at scale.
That’s fair, but it’s also misguided.
Either try it yourself, or go and watch people (e.g. @ArminRonacher) doing this at a reasonable scale, and you can ground yourself in reality instead of hype.
The tldr is: currently it doesn’t scale.
Not personally. Not at Microsoft. Not at $AI company.
Currently, as the list of “don’t change existing behaviour” constraints grows, unsupervised agent capability drops. And since most companies and individual devs don’t appreciate “help” that does one thing while breaking another, that puts a significant hole in the “everyone can 10x” bedtime story.
As mentioned in other threads: the cost and effort of producing new code is down, but the cost of producing usable code is, I’d guess, roughly on par with existing scaffolding tools.
Some domains where the constraints are more relaxed, like image generation (no hands? who cares?) and front-end code (wrong styles? not consistent? who cares?), are genuinely experiencing the kind of revolution the OP was talking about.
…but generalising it to “all coding” looks a lot like the self-driving car problem.
Solvable? Probably.
…but a bit harder than the people who don’t understand it, or haven’t tried to solve it themselves, have assumed, blogged, or speculated.
There’s probably a much smaller set of problems that are much easier to solve… but it’s not happening in 2026; certainly not at the scale and rate the OP was describing.
You’ll notice neither of us has a self-driving car yet.
(Some companies do provide cars that drive themselves into things from time to time, but that’s always “user error”, as I understand it…)
Just to test out the OP article’s theory, I was about to write some unit tests, so I let Opus 4.5 have a go. It did a pretty good job, but I spent probably as much time parsing what it had done as I would have spent writing the code from scratch. I still needed to clean it up, and of course, unsurprisingly, it had written a few tests that only really exercised the mocks it had created. The kind of mistake I wouldn't be caught dead sending in for peer review.
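For anyone who hasn't hit that failure mode: here's a minimal sketch of what such a test looks like. The names (`fetch_user`, `get_display_name`) are invented for illustration, not from the actual session; the shape of the mistake is the point.

```python
# Hypothetical sketch of the anti-pattern: a "unit test" whose assertion
# only confirms the canned value fed into its own mock. Names are made up.
import unittest
from unittest.mock import patch


def fetch_user(user_id):
    """Stand-in for real code that would hit a database."""
    raise NotImplementedError


def get_display_name(user_id):
    """The 'unit under test' -- passes the fetched name straight through."""
    return fetch_user(user_id)["name"]


class TestDisplayName(unittest.TestCase):
    @patch(f"{__name__}.fetch_user")
    def test_display_name(self, mock_fetch):
        # This assertion merely round-trips the mock's return value;
        # it would stay green even if the real fetch_user, the data
        # model, or the formatting changed. It verifies the mock,
        # not any behaviour of the system.
        mock_fetch.return_value = {"name": "Ada"}
        self.assertEqual(get_display_name(1), "Ada")


if __name__ == "__main__":
    unittest.main()
```

A test like this passes no matter what the real code does, which is exactly why it's worthless as a regression check, and exactly the kind of thing you only catch by reading what the agent actually wrote.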
I'm glad the OP feels fine just letting Opus do whatever it wants without pausing to look under the covers, and perhaps we all have to learn to stop worrying and love the LLM? But really, I think what we're witnessing here and now is just another hype article, written by a professional blogger and speaker who's highly motivated to produce engagement bait like this.