logoalt Hacker News

ericmceryesterday at 4:22 PM5 repliesview on HN

I mean at this point can we just conclude that there are a group of engineers who claim to have incredible success with it and a group that claim it is unreliable and cannot be trusted to do complex tasks.

I struggle to believe that a ton of seemingly intelligent software engineers are too dumb to figure out how to use Claude code to get reliable results, it seems much more likely to me that it can do well at isolated tasks or new projects but fails when pointed at large complex code bases because it just... is a token predictor lol.

But yeah spinning up a green fields project in an extensively solved area (ledgers) is going to be something an AI shines at.

It isn't like we don't use this stuff also, I ask Cursor to do things 20x a day and it does something I don't like 50% of the time. Even things like pasting an error message it struggles with. How do I reconcile my actual daily experience with hype messages I see online?


Replies

rurpyesterday at 4:50 PM

Right, I keep seeing people talking past each other in this same way. I don't doubt folks when they say they coded up some greenfield project 10x faster with Claude, it's clearly great at many of those tasks! But then so many of them claim that their experience should translate to every developer in every scenario, to the point of saying they must be using it wrong if they aren't having the same experience.

Many software devs work in teams on large projects where LLMs have a more nuanced value. I myself mostly work on a large project inside a large organization. Spitting out lines of code is practically never a bottleneck for me. Running a suite of agents to generate out a ton of code for my coworkers to review doesn't really solve a problem that I have. I still use Claude in other ways and find it useful, but I'm certainly not 10x more productive with it.

dns_snekyesterday at 5:26 PM

> But yeah spinning up a green fields project in an extensively solved area (ledgers) is going to be something an AI shines at.

I couldn't disagree with this more. It's impressive at building demos, but asking it to build the foundation for a long-term project has been disastrous in my experience.

When you have an established project and you're asking it to color between the lines it can do that well (most of the time), but when you give it a blank canvas and a lot of autonomy it will likely end up generating crap code at a staggering pace. It becomes a constant fight against entropy where every mess you don't clean up immediately gets picked up as "the way things should be done" the next time.

Before someone asks, this is my experience with both Claude Code (Sonnet/Opus 4.6) and Codex (GPT 5.4).

hombre_fatalyesterday at 4:26 PM

I suspect many people here have tried it, but they expected it to one-shot any prompt, and when it didn't, it confirmed what they wanted to be true and they responded with "hah, see?" and then washed their hands of it.

So it's not that they're too stupid. There are various motivations for this: clinging on to familiarity, resistance to what feels like yet another tool, anti-AI koolaid, earnestly underwhelmed but don't understand how much better it can be, reacting to what they perceive to be incessant cheerleading, etc.

It's kind of like anti-Javascript posts on HN 10+ years ago. These people weren't too stupid to understand how you could steelman Node.js, they just weren't curious enough to ask, and maybe it turned out they hadn't even used Javascript since "DHTML" was a term except to do $(".box").toggle().

I wish there were more curiosity on HN.

show 1 reply
rattlesnakedaveyesterday at 4:28 PM

“I struggle to believe that a ton of seemingly intelligent software engineers are too dumb to figure out how to use Claude code to get reliable results”

Seemingly is doing the heavy lifting here. If you read enough comment threads on HN, it will become obvious why they aren’t getting results.

show 2 replies
jerfyesterday at 5:50 PM

I see two basic cases for the people who are claiming it is useless at this point.

One is that they tried AI-based coding a year or two ago, came to the IMHO completely correct at that time conclusion that it was nearly useless, and have not tried it since then to see that the situation has changed. To which the solution is, try it again. It changed a lot.

The other are those who have incorporated into their personal identity that they hate AI and will never use it. I have seen people do things like fire AI at a task they have good reasons to believe it will fail at, and when it does, project that out to all tasks without letting themselves consciously realize that picking a bad task on purpose skews the deck.

To those people my solution is to encourage them to hold on to their skepticism. I try to hold on to it as well despite the incredible cognitive temptation not to. It is very useful. But at the same time... yeah, there was a step change in the past year or so. It has gotten a lot more useful...

... but a lot of that utility is in ways that don't obviate skilled senior coding skills. It likes to write scripting code without strong types. Since the last time I wrote that, I have in fact used it in a situation where there were enough strong types that it spontaneously originated some, but it still tends to write scripting code out of that context no matter what language it is working in. It is good at very straight-line solutions to code but I rarely see it suggest using databases, or event sourcing, or a message bus, or any of a lot of other things... it has a lot of Not Invented Here syndrome where it instead bashes out some minimal solution that passes the unit tests with flying colors but can't be deployed at scale. No matter how much documentation a project has it often ends up duplicating code just because the context window is only so large and it doesn't necessarily know where the duplicated code might be. There's all sorts of ways it still needs help to produce good output.

I also wonder how many people are failing to prompt it enough. Some of my prompts are basically "take this and do that and write a function to log the error", but a lot of my prompts are a screen or two of relevant context of the project, what it is we are trying to do, why the obvious solution doesn't work, here's some other code to look at, here's the relevant bugs and some Wiki documentation on the planning of the project, we should use {event sourcing/immutable trees/stored procedures/whatever}, interact with me for questions before starting anything. This is not a complete explanation of what they are doing anymore, but there's still a lot of ways in which what an LLM can really do is style transfer... it is just taking "take this and do that and write a function to log the error" and style-transforming that into source code. If you want it to do something interesting it really helps to give it enough information in the first place for the "style transfer" to get a hold of and do something with. Don't feel silly "explaining it to a computer", you're giving the function enough data to operate on.

show 1 reply