Even using Fable (while it was briefly available), having it refine a plan, and directing it to make...

Aurornis • today at 2:28 AM • 10 replies • view on HN

Even using Fable (while it was briefly available), having it refine a plan, and directing it to make only small incremental changes, I still found reasons to reject its first pass at a lot of work. There was a lot of “You’re right to push back” responses. A lot of incidents where it would creat some giant complex set of abstractions to accomplish something that I could find ways to do much more elegantly and in a more maintainable manner.

It’s really eye opening to work with these tools on a codebase you know deeply because these problems are everywhere.

However if I opened an unfamiliar project in another language and I wanted to add a little feature with no intention of maintaining it, I’d happily accept the changes and loop until it worked well enough for my temporary needs.

The scary middle is when you’re dealing with coworkers who don’t care about anything other than closing tickets and collecting credit. With enough of a token budget you can now wrap loops around an LLM and have it try things until the program appears to work. Ask it to do a code review and then submit the PR without having understood what it was doing. There are a lot of workplaces where there isn’t a good mechanism to push back on this and the tech debt just keeps growing.

Replies

abhgh • today at 3:17 AM

These "You're right to push back" scenarios are scary for me. I mostly code ML implementations, and some of the errors Claude Code (CC - have only used Opus 4.7) makes are very sneaky, and if you don't have sufficient experience in the area (I see this with people entering ML and writing their implementations with CC), you wouldn't know when to question CC and will let errors or future pitfalls silently slip into your code. A recent example was when there was data leakage in a model calibration step, which it refused to see as an error, till I wrote a detailed reason, and then it agreed that there was a "subtle leakage".

➕ show 2 replies

latexr • today at 9:18 AM

In your second and third paragraphs you’re essentially describing Gell-Mann amnesia.

https://en.wikipedia.org/wiki/Michael_Crichton#%22Gell-Mann_...

resonious • today at 2:56 AM

All Claude models are huge suck ups. The "you're absolutely right" meme is real even if that exact phrase doesn't show up as much anymore.

I don't want to start a fight or anything but IME Codex has a bit more of a spine. If you point out something weird, it sometimes gives a good reason for it. Whereas Claude will always say "whoopsie you're right as always sir" even when it's me who missed something.

➕ show 3 replies

fy20 • today at 6:26 AM

A nice trick I've found is following up with "make it simpler". Often you can do 2-3 rounds of that and end up with something much easier to comprehend but still meeting the requirements.

I have a Rails background, so maybe KISS is more engrained in my philosophy than whatever training material was used on AI. At least it isn't heavily pushing design patterns...

dapperdrake • today at 6:28 AM

Maybe that feedback loop finally got fast enough to die out.

embedding-shape • today at 2:49 AM

> There are a lot of workplaces where there isn’t a good mechanism to push back on this and the tech debt just keeps growing.

If the "big ball of spaghetti" theory holds, where software companies who can't manage the debt stumble over themselves as they continue to add to the big ball of spaghetti code, I guess we'll see a row of companies declaring "software bankruptcy" or something in some/many months, depending on how well these workspaces learn to care slightly more and get better at pushing back against slop.

➕ show 4 replies

busterarm • today at 2:38 AM

> With enough of a token budget you can now wrap loops around an LLM and have it try things until the program appears to work. Ask it to do a code review and then submit the PR without having understood what it was doing. There are a lot of workplaces where there isn’t a good mechanism to push back on this and the tech debt just keeps growing.

I'm not making an argument in favor of people using LLMs for this, but people were doing this before we had LLMs it was just usually a bit slower. I can't even say it usually doesn't work out long term because I worked with a lot of guys who did this and took a ton of Adderall while working practically around the clock. Every incentive structure in the organizations rewarded it along with social credibility from more junior engineers. (The last cowboy I worked with who pulled this shit ended up becoming the most senior engineer in the company, a multi-millionaire and worshipped like a god by 90% of the mostly fresh grads we were hiring).

The problem is when invariably these people burn out eventually and leave, they leave a massive vacuum in their stead. Not from load they were carrying but creating.

I think the larger the organization I've been at, the more they reward the people making huge commits on nights and weekends. Worse, they could get away with TBRing their shit and merging it without review.

LLMs are often all of the bad habits and organizational problems that we already carryied just being speedrun. There are some places doing it right, but they already were.

➕ show 1 reply

darkerside • today at 3:40 AM

In fairness, you could throw the most senior engineer into a brand new codebase, and they would probably make a dozen mistakes if you immediately had them pick up invasive and risky work.

➕ show 1 reply

justinclift • today at 5:58 AM

> "You’re right to push back"

It sounds like you've not conditioned your Claude to stop being a sycophant yet?

alt Hacker News

Replies