Hacker News

scuff3d · last Tuesday at 7:47 PM

A guy at work did a demo of an agent workflow for some higher-ups (we have chatbots but haven't adopted agents yet). He raved about how, after writing a several-hundred-line spec, being extremely specific about the technology to use, and figuring out where to put all the guardrails, he was able to get Claude to generate weeks' worth of code. When all was said and done it was something like 20k lines of code between implementation, tests, and helper tools. Along the way he acknowledged you have to keep a close eye on it, or it will generate functions that pass tests but don't actually do their jobs, tests that pass but don't test anything, and a bunch of other issues.
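To make that last failure mode concrete, here's a hypothetical sketch (not from the demo) of a "test" that exercises a function but asserts nothing meaningful, so it passes even though the implementation is broken:

  # Hypothetical example of a test that passes but doesn't test anything.
  def apply_discount(price, rate):
      # Bug: the discount is computed but never returned.
      discounted = price * (1 - rate)
      return price

  def test_apply_discount():
      result = apply_discount(100.0, 0.2)
      # Only checks that *something* came back, so the bug slips through.
      assert result is not None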

To people with little to no practical software experience, I can see why that seems incredible. Think of the savings! But anyone who's worked in a legacy code base, even a well-written one, knows the pain. This is worse. That legacy code base was at least written with intention, and is hopefully battle-tested to some degree by the time you look at it. This is 20k lines of code written by an intern that you are now responsible for going through line by line, which is going to take at least as long as it would have taken to write it yourself.

There are obvious wins from AI and agents, but this type of development is a bad idea. Iteration loops need to be kept much smaller, and you should still be testing as you go, like you would when writing everything yourself. Otherwise it's going to turn into an absolute nightmare fast.


Replies

inetknght · last Tuesday at 7:53 PM

Even when asked to write small tests, Claude 4.5 Sonnet Thinking still ends up writing tests that do nothing or don't do what it says they will do. And it's always fucking cheery about it: "your code is now production-ready!" and "this is an excellent idea!" and "all errors are now fixed! your code is production-ready!" and "I fixed the compiler issue, we're now production ready!"

...almost as if it's too eager to make its first commit. Much like a junior engineer might be.

It's not eager enough to iterate. Moreover, when it does iterate, it often brings along the same wrong solutions it came up with before.

It's way easier to keep an eye on small changes while iterating with AI than it is with letting it run free in a green field.
