Hacker News

MontyCarloHall today at 2:57 PM (6 replies)

   I don’t prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops.
   — Boris Cherny, head of Claude Code
Reliability is a direct reflection of the quality of the underlying infrastructural code. If even Anthropic, the company with the world's best agentic vibecoders, has horribly unreliable infrastructure, it really says something about the quality of the world's best agentically produced code.

Replies

brookst today at 3:14 PM

Is there any indication these errors are related to Anthropic-written code as opposed to operational issues from the fastest-growing infra buildout ever?

Layer-wise, the app is pretty far removed from request routing to GPU pools.

dsmurrell today at 3:00 PM

I wonder how they fix things when Claude is down.

MattGaiser today at 3:12 PM

On the other hand we are also willing to buy it, so reliability is arguably not as valued a good as people assumed.

TacticalCoder today at 3:46 PM

> If even Anthropic, the company with the world's best agentic vibecoders...

But that's really not what they have. They have AI experts who are creating incredible LLMs.

Everything else is more than meh: Claude Code is really bad. Such a turd would never have gained any traction if it wasn't for the LLMs behind it.

I use LLMs to code daily (still Claude Code, mind you, since I haven't taken the time to switch yet) and these models are both amazing and pathetic.

If you don't verify everything they output, they do the absolute craziest thing imaginable.

One example: I had an Anthropic model notice a "pattern" in range-bound integer values. I had them bounded between, e.g., 0xCAFE0000 and 0xCAFEFFFF. At some point a comparison/validation was needed, and instead of doing an integer comparison the Anthropic model went ballistic: it converted the numbers to strings, started doing substring matching on "0xCAFE", and went even more "expert" by verifying at which position the match occurred. All that while explaining why it couldn't possibly fail.
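To make the contrast concrete, here is a minimal Python sketch. The function names and the exact shape of the string-based version are assumptions for illustration, not the code the model actually produced; only the two bounds come from the comment above.

```python
# Bounds taken from the comment; everything else here is hypothetical.
LOW = 0xCAFE0000
HIGH = 0xCAFEFFFF


def in_range(value: int) -> bool:
    """Plain integer comparison: the whole check fits on one line."""
    return LOW <= value <= HIGH


def in_range_via_string(value: int) -> bool:
    """Assumed reconstruction of the string-based approach:
    format as hex, search for the prefix, verify the match position."""
    text = f"0x{value:X}"
    return text.find("0xCAFE") == 0


# Both agree on the endpoints of the intended range.
print(in_range(0xCAFE0000), in_range_via_string(0xCAFE0000))
print(in_range(0xCAFEFFFF), in_range_via_string(0xCAFEFFFF))

# The string version accepts values that merely start with the same
# hex digits, such as 0xCAFE itself or a nine-digit 0xCAFE00000.
print(in_range(0xCAFE), in_range_via_string(0xCAFE))
print(in_range(0xCAFE00000), in_range_via_string(0xCAFE00000))
```

In this sketch the prefix match ignores the width of the number, so it can in fact fail where the integer comparison does not.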

Why did it do that? Very likely because, in a comment, it saw "0xCAFE..." as a string. And the thing saw a pattern.

Can you believe it? There's a pattern. So it must light up connections. We've got a pattern!

No amount of kludge, hidden pre-processing, or hidden post-processing is going to fix the "quality" of the code produced by something that, instead of doing an integer comparison, converts things to strings and then does substring searches and index computations.

There's no fixing that.

Yesterday I needed three guard clauses before pushing data. Two of the three "logic gates" (as the model would explain they were, which is kinda right) it got right. The third one: same thing. It was planning to go ballistic and introduce countless lines of code and insane abstractions for a test that was solved with a one-line timestamp comparison.
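For readers who want a picture of what that looks like, a rough sketch follows. The comment does not say what the three conditions were, so the function name, parameters, and the first two guards are invented for illustration; only the idea of a one-line timestamp comparison as the third guard comes from the comment.

```python
from datetime import datetime, timezone
from typing import Optional


def should_push(
    payload: Optional[dict],
    connected: bool,
    last_pushed_at: datetime,
    updated_at: datetime,
) -> bool:
    """Three guard clauses before pushing data (all names hypothetical)."""
    if payload is None:  # guard 1 (assumed): nothing to push
        return False
    if not connected:  # guard 2 (assumed): no destination available
        return False
    if updated_at <= last_pushed_at:  # guard 3: one-line timestamp comparison
        return False
    return True


earlier = datetime(2024, 1, 1, tzinfo=timezone.utc)
later = datetime(2024, 1, 2, tzinfo=timezone.utc)
print(should_push({"k": 1}, True, earlier, later))
print(should_push({"k": 1}, True, later, earlier))
```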

Because it does things like that, the people who say they don't code anymore are delusional if they think this produces, as of today, quality code.

It's like that other dude who was happy to produce 37 K LOC per day and counting.

> ... it really says something about the quality of the world's best agentically produced code

Oh it is totally shit code. But if you monitor everything and vet everything they do, it's helpful.

I find these LLMs way more helpful at finding the source of bugs (not fixing them: finding them, which is 90% of the job anyway) and at acting like rubber ducks than at writing code.

Claude Code sucks. Claude Code CLI sucks. Their only "solution" to every problem is to create VMs, spin up headless browsers, and resort to incredible hacks (the infamous "game loop" that modifies the characters output by the LLM is just shameful) to try to hide the misery. It's miserable kludges everywhere.

And the only reason these miserable kludges are not entirely falling apart is because they rest on the shoulders of actual giants: projects like Linux, QEMU, etc. that were not vibe-coded.

It's sad to have useful tools (the models) and to make such poor use of them.

I'm pretty sure that, in the end, it'll be just like open source powering the entire world by now: we'll have open-source projects like Pi, and then newer ones, that are going to come out and fix the mess we have now. And they're not going to be 100% vibe-coded by people whose job is "to write loops".

rvz today at 3:48 PM

He is a salesman at this point and is not talking to you. He is talking to the investors who want to vibe code loops to waste tokens on building slop to get rid of you.

Goes to show how fake this industry has become when VC dollars have flooded it.

Somehow it is fine to vibe code infrastructure or security because someone (with a clear vested interest) wants you to spend more tokens at their casino because that is how they "win" at the casino (which they work at).

Except in reality, this part of software is critical, it is irresponsible to just "write loops" for it, and we all know that he doesn't believe what he is saying.

hombre_fatal today at 3:06 PM

Meh, this is the "must be the veganism" fallacy: if someone knows you're vegan, then any ailment you might have, no matter how ubiquitous in the population, must be somehow due to your vegan diet and no more details are required.

Except now it's the "AI did it" fallacy where if you know a company uses AI, even infra scaling issues must be due to AI, and if you had just used less or no AI, you would have been spared even though that has never been true.

The usual response to this goes something like "well, they made claims that AI is good", so anything short of perfection supposedly debunks the claim.
