Hacker News

giwook, last Monday at 4:24 PM

I wonder how much of this is simply needing to adapt one's workflows to models as they evolve and how much of this is actual degradation of the model, whether it's due to a version change or it's at the inference level.

Also, everyone has a different workflow. I can't say that I've noticed a meaningful change in Claude Code quality in a project I've been working on for a while now. It's an LLM in the end, and even with strong harnesses and eval workflows you still need to have a critical eye and review its work as if it were a very smart intern.

Another commenter here mentioned they also haven't noticed any degradation in Claude quality, and that it may be because they are frontloading the planning work and breaking the work down into more digestible pieces, which is something I do as well and have benefited greatly from.

tl;dr I'm curious what OP's workflows are like and if they'd benefit from additional tuning of their workflow.


Replies

8note, last Monday at 4:28 PM

I've noticed a strong degradation as it's started doing more skill-like things and writing more one-off Python scripts rather than using tools.

The agent has a set of well-tested scripts, but instead it chooses to write a new bespoke script every time it needs to do something. As a result it writes the same bugs over and over again, plus unique new bugs each time.

germandiago, yesterday at 6:19 AM

> I wonder how much of this is simply needing to adapt one's workflows to models as they evolve and how much of this is actual degradation of the model,

I also wonder how much people are willing to adapt to unreliability for the sake of laziness instead of, at some point, taking the lead and solving the problem themselves when they have the knowledge and reliable resources.

The way you phrase it makes it sound to me as if anything a human comes up with when coding must go through an LLM. There are times it helps and tasks it handles, but I have also quite often found tasks where, had I done them myself in the first place, I would have skipped a lot of confusion, back-and-forth, and wasted time, and would have ended up with a simpler, better-coded solution.
