Hacker News

pierrekin · today at 4:50 PM

There is something darkly comical about using an LLM to write up your “a coding agent deleted our production database” Twitter post.

On another note, I consider users asking a coding agent “why did you do that” to be illustrating a misunderstanding in the user’s mind about how the agent works. It doesn’t decide to do something and then do it; it just outputs text. Then again, Anthropic has made so many changes that make it harder to see the context and thinking steps; maybe this is an attempt at clawing back that visibility.
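A coding agent is, at bottom, a loop like the sketch below (Python; llm.complete and parse_tool_call are illustrative stand-ins, not any vendor's actual API). The model only ever emits text; the harness is what parses that text and runs commands:

    import subprocess

    def agent_loop(llm, task):
        # The conversation so far, fed back to the model on every step.
        history = [{"role": "user", "content": task}]
        while True:
            text = llm.complete(history)      # the model only outputs text
            history.append({"role": "assistant", "content": text})
            command = parse_tool_call(text)   # the harness parses that text
            if command is None:
                return text                   # no tool call left: done
            # The harness, not the model, is what actually runs things.
            result = subprocess.run(command, shell=True,
                                    capture_output=True, text=True)
            history.append({"role": "user",
                            "content": result.stdout + result.stderr})

Asking “why did you do that” just appends more text to the history; the next completion is a plausible-sounding continuation, not a readout of an internal decision.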


Replies

vidarh · today at 7:12 PM

If you ask humans to explain why they did something, Sperry's split-brain experiments give reason to think you can't trust their accounts either (his experiments showed the brain making up justifications for decisions it never made).

But it can still be useful, as long as you interpret it as "which stimuli most likely triggered the behaviour?" You can't trust it uncritically, but models do sometimes pinpoint useful things about how they were prompted.

59nadir · today at 5:05 PM

> a misunderstanding in the user’s mind about how the agent works

On top of that, the agent is just doing what the LLM says to do, but somehow Opus is not brought up except as a parenthetical in this post. Sure, Cursor markets safety it can't provide, but the model was the one that issued the tool call. If people like this think that their data will be safe if they just use the right agent with access to the same things, they're in for a rude awakening.

From the article, apparently an instruction:

> "NEVER FUCKING GUESS!"

Guessing is literally the entire point: guess tokens in sequence and something resembling coherent thought comes out.
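That loop is the whole generation procedure. A minimal sketch (Python; model.next_token_probs is an illustrative stand-in for a model that returns a probability for every token in the vocabulary):

    import random

    def generate(model, prompt_tokens, max_new=100):
        tokens = list(prompt_tokens)
        for _ in range(max_new):
            probs = model.next_token_probs(tokens)
            # Sample the next token: every single step is a weighted guess.
            tokens.append(random.choices(range(len(probs)), weights=probs)[0])
        return tokens

Everything the agent "decided" came out of that sampling loop, one guessed token at a time.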

NewsaHackO · today at 5:03 PM

Twitter users get paid for these 'articles' based on engagement, correct? That may be why it is so dramatized.

jeremyccrane · today at 8:04 PM

I'm not some vibe coder, and AI agents can be incredibly powerful. But yes, the irony is not lost on us!

khazhoux · today at 8:03 PM

> systemic failures across two heavily-marketed vendors that made this not only possible but inevitable.

> No confirmation step. No "type DELETE to confirm." No "this volume contains production data, are you sure?" No environment scoping. Nothing.

> The agent that made this call was Cursor running Anthropic's Claude Opus 4.6 — the flagship model. The most capable model in the industry. The most expensive tier. Not Composer, not Cursor's small/fast variant, not a cost-optimized auto-routed model. The flagship.

The tropes, the tropes!!

https://tropes.fyi/
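For what it's worth, the safeguard the quoted passage describes as missing is a few lines of harness code. A sketch (not Cursor's actual implementation; the keyword list and function names are illustrative):

    DESTRUCTIVE = ("drop ", "delete ", "truncate ", "rm ")

    def gate(command, env):
        # Require a typed human confirmation for obviously destructive
        # commands, and say which environment would be affected.
        if any(marker in command.lower() for marker in DESTRUCTIVE):
            print(f"[{env}] agent wants to run: {command}")
            return input("type DELETE to confirm: ").strip() == "DELETE"
        return True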

xnx · today at 8:39 PM

An LLM will reply with a plausible explanation of why someone would have written the code that it just wrote. Seems about the same.

badgersnake · today at 7:35 PM

Seems like they’ve already reached the point where they’ve forgotten how to think.

jayd16 · today at 7:15 PM

Beyond that, isn't it just going to make up a narrative to fit what's in the prompt and context?

I don't think there's any special introspection that can be done, even in a mechanical sense, is there? That is to say, asking any other model, or a human, to read what was done and explain why would give you an accounting that is just as fictional.

oofbey · today at 7:03 PM

> It doesn’t decide to do something and then do it, it just outputs text.

We can debate philosophy and theory of mind (I’d rather not), but any reasonable coding agent totally DOES consider what it’s going to do before acting. Reasoning. Chain of thought. You can hide behind “it’s just autoregressively predicting the next token, not thinking” and pretend none of the intuition we have for human behavior applies to LLMs, but it’s self-limiting to do so. Many, many of their behaviors mimic human behavior, and the same mechanisms for controlling this kind of decision-making apply to both humans and AI.

gobdovan · today at 7:23 PM

> asking a coding agent “why did you do that” to be illustrating a misunderstanding in the user’s mind about how the agent works

I think the same thing, but about agents in general. I am not saying that we humans are automata, but most of the time, explanation diverges profoundly from motivation: motivation is what generated our actions, while explanation is the process of observing our actions and giving ourselves, and others around us, plausible mechanics for what generated them.