We're going to see a lot of this in the near future and it will be 100% earned. Too many people think that "move fast and break stuff" is the correct paradigm for success. Too many people are using these tools without understanding how LLMs work, and without the requisite engineering experience to know even the lowest-level stuff, like how to protect secrets.
I don't even like having secrets on disk for my personal projects that only I will touch. Why was there a plaintext production database credential available to the agent anywhere on the disk in the first place? How did the agent gain access to the file system outside of the code base?
The Railway stuff isn't great, don't get me wrong, but plaintext production secrets on disk is one of the reddest possible flags to me, and he just kind of breezes over it in the post mortem. It's all I needed to read to know he doesn't have the experience required to run a production application that businesses rely on for their day-to-day.
Correction: They deleted their prod db and then they had another agent write an em-dash-filled postmortem. No shame.
This is your reminder to set up canary tokens: https://canarytokens.org/nest/
I had a token I set up 3 years ago for AWS that I hadn't used. I was recently doing something with Claude and was asking it to interact with our AWS dev environment. I was watching it pretty closely and saw it start to struggle (I forget exactly what was going on), and I figured it was >50% likely to hit my canary token. Sure enough, a few minutes later it did and I got an email. Part of why I let it continue to cook was that I hadn't tested my canary in ~3 years.
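For anyone who hasn't set one up: an AWS canary token is just a real-looking key pair that fires an alert the moment anything tries to use it. A minimal sketch of planting one where an agent is likely to trip over it; the key values below are placeholders for the pair canarytokens.org generates:

    # Plant a decoy AWS profile in the default credentials location.
    # The values are placeholders; paste in the pair from canarytokens.org,
    # which emails you as soon as the key is used anywhere.
    from pathlib import Path

    CANARY_PROFILE = """
    [prod-admin]
    aws_access_key_id = AKIA_PLACEHOLDER_CANARY
    aws_secret_access_key = PLACEHOLDER_SECRET_FROM_CANARYTOKENS
    """

    creds = Path.home() / ".aws" / "credentials"
    creds.parent.mkdir(exist_ok=True)
    with open(creds, "a") as f:  # append, so real profiles are untouched
        f.write(CANARY_PROFILE)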
These stories make me rethink my approach to infra. I would never run AI with prod access, but my manager definitely has a way to obtain prod tokens if he really wanted to. Or if an AI agent acting on his behalf wanted to. He loves AI, and nowadays 80% of his messages are clearly written by AI. Sometimes I wonder if he's been replaced by AI. And I can't stop them. So I probably need to double down on backups and immutability...
Think of AI like a genius 16-year-old. Accidents will happen, so only let the AI (or the 16-year-old) access systems where you are sure you have a recovery plan.
It's actually interesting to me that the author is surprised the agent could make API calls, and that one of those calls could delete the production database.
It's a sad story, but at the same time it clearly shows that people don't know how agents work; they just want to "use it".
> enumerating the specific safety rules it had violated.
That's not how safety works at all. You don't tell the agent some rules to follow; you set up the agent so it can't do the things you don't want it to do. It is very simple and rather obvious, and I wish we could stop having this discussion already.
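To make it concrete, a minimal sketch of "can't" rather than "told not to", assuming Python and a SQLite database (tool names and the db path are illustrative):

    import sqlite3

    def run_query(sql: str) -> list:
        # The connection itself is read-only: any write fails at the
        # database layer no matter what the model was prompted to do.
        conn = sqlite3.connect("file:app.db?mode=ro", uri=True)
        try:
            return conn.execute(sql).fetchall()
        finally:
            conn.close()

    TOOLS = {"run_query": run_query}  # no delete tool exists to be misused

    def dispatch(name: str, *args):
        if name not in TOOLS:
            raise PermissionError(f"tool {name!r} is not exposed to the agent")
        return TOOLS[name](*args)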
That's very unfortunate. How did it have access to the production DB in the first place?
I'm thinking twice about running Claude in an easily violated Docker sandbox (weak restrictions, because I want to use NVIDIA Nsight with it). At this stage, at least, I'd never give it explicit access to anything I cared about it destroying.
Even if someone gets them to reliably follow instructions, no one's figured out how to secure them against prompt injection, as far as I know.
This isn't the marketing flex you think it is.
Measure twice, cut once.
I previously worked at a managed-database-as-a-service company. On more than one occasion during my time there, a junior engineer deleted a customer's database, and at least once one of our most senior DBAs made one unrecoverable. We never got such straightforward confessions out of them.
Seems like this guy blames everyone except himself for trusting this stuff in the first place. Here's what Cursor did wrong. Here's what Railway did wrong. How about yourself?
Hi. Don't give your agents destructive access to your production databases or infrastructure. You can give it tools to use, and let it write queries and read logs if you want. You don't need to give it "delete company" privileges.
Think about the positives. With any luck, we will soon have a report of a deleted surveillance dataset.
It's definitely the fault of the operator. But also, how many times has an AI deleted or modified files it was told not to touch (and then lied about doing so)?
How have they not solved this permissions problem? If the AI is operating on a database it should be using creds that don't have DELETE permissions.
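The permissions problem has been solved at the database layer for decades; people just don't use it. A sketch of what that looks like, assuming Postgres and psycopg2 (role, database, and password are placeholders):

    import psycopg2

    DDL = """
    CREATE ROLE agent_ro LOGIN PASSWORD 'placeholder';
    GRANT CONNECT ON DATABASE app TO agent_ro;
    GRANT USAGE ON SCHEMA public TO agent_ro;
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO agent_ro;
    -- no DELETE, UPDATE, TRUNCATE, or DDL granted: if the agent tries
    -- "DELETE FROM users" it gets a permission error, not data loss
    """

    with psycopg2.connect("dbname=app user=admin") as conn:
        with conn.cursor() as cur:
            cur.execute(DDL)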
Or just don't use a tool like AI that can't be relied on.
Absolutely zero sympathy. You’re responsible for anything an agent you instructed does. Allowing it to run independently is on you (and all the others doing exactly this). This is only going to become more and more common.
Yeah, this is what your agents do even before someone tries to trick them into doing something stupid.
Remember this: these things follow instructions so poorly that they nuke everything without anyone even trying to break the prompt. Imagine how easily someone could break the prompt if the agent is ever given user input.
The agent didn't delete their production database. They deleted their production database. The agent was just the tool they used to do it.
> Because Railway stores volume-level backups in the same volume
Anyone familiar with Railway know why it's done this way? This seems glaringly bad on its face.
The biggest rule was broken not by the agent or the infra company, but by the person who handed such an elevated authorization (an API key) to an autonomous bot.
I think the root cause is not AI, but:
1. The delete-volume API doesn't ask for confirmation or approval from another actor. Looks like there are no guardrails on the delete API.
2. Authorization: agents should not have standing permission to delete infra unless it's granted deliberately (a sketch of both guardrails follows below).
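A hypothetical sketch of what those guardrails could look like on the provider side (endpoint, header, and scope names are invented, not Railway's actual API):

    from flask import Flask, request, abort

    app = Flask(__name__)

    # toy in-memory state standing in for a real control plane
    VOLUMES = {"vol-123": {"attached_to": "web-prod"}}
    TOKEN_SCOPES = {"agent-token": {"volumes:read"}}  # deliberately no delete scope

    @app.route("/volumes/<vid>", methods=["DELETE"])
    def delete_volume(vid):
        token = request.headers.get("Authorization", "")
        # guardrail 2: tokens carry scopes; an agent's token lacks this one
        if "volumes:delete" not in TOKEN_SCOPES.get(token, set()):
            abort(403)
        # guardrail 1: destructive calls need a resource-specific confirmation
        if request.headers.get("X-Confirm-Delete") != vid:
            abort(428)
        vol = VOLUMES.get(vid)
        if vol is None:
            abort(404)
        if vol["attached_to"]:
            abort(409)  # "this volume is in use by <service>, are you sure?"
        del VOLUMES[vid]
        return "", 204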
it's still hilarious to me that people give agents such privileges and let them run without supervision
it's also hilarious to see the human LARP as if the LLM had guilt or accountability, therapeutically shouting at a piece of software as if the deletion of the whole volume and its backups weren't his own fault and a product of his obvious lack of basic knowledge of the systems he's using
I don't see the problem here. These people will be pushed out of the industry quickly and their business taken by others who also use agents, but who are smart enough to run them sandboxed, without any permissions on production or even dev data and systems.
The post overall is interesting, but this:
> A single API call deletes a production volume. There is no "type DELETE to confirm." There is no "this volume is in use by a service named [X], are you sure?" There is no rate-limit or destructive-operation cooldown.
...makes me question the author's technical competence.
Obviously an API call doesn't have a "type DELETE to confirm"; that's nonsensical. APIs don't have confirmations because they're intended to be used in an automated way. Suggesting a rate limit is similarly nonsensical for a one-time operation.
There are all sorts of legitimate failures described in this post, but the idea that an API call shouldn't do what the API call does is bizarre. It's an API, not a user interface.
I'm not familiar with Cursor. Does it let the agent run "curl -X POST" with no approval, i.e. without a popup asking you to approve/deny/always approve? AFAIK, with Claude Code this can only happen if you use something like "--dangerously-skip-permissions". I have never used this; I manually approve all commands my agent runs. Pretty insane that people are letting agents do whatever they want and trusting the guardrails will work 100% of the time.
Example from my own project's agent log, from the time it destroyed its database:
https://github.com/GistNoesis/Shoggoth.dbExamples/blob/main/...
Project Main repo : https://github.com/GistNoesis/Shoggoth.db/
This is a classic anchoring failure. The LLM read the request, framed the risk space ("looks like cleanup is needed"), and the human didn't challenge that framing before it acted.
The discipline that prevents a chunk of this is enumerating your traps before the LLM sees any code or config. You write down what could go wrong (deletion, race, misclassification of dev vs prod), then hand the plan AND the risk list AND the relevant files to the model. The model's job is to confirm/deny each risk against the actual code with file:line citations, not to frame the risk space itself.
Pre-implementation. Anchoring defense. The opposite of "vibe coding."
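One way to mechanize this, sketched in Python (the prompt wording and risk items are illustrative; the risk list is written by the human before the model sees anything):

    RISKS = [
        "R1: operation could target prod instead of dev",
        "R2: cleanup could delete a volume that is still attached",
        "R3: deploy/migration race could corrupt state",
    ]

    def build_review_prompt(plan: str, files: dict[str, str]) -> str:
        # Hand over the plan, the human-authored risk list, and the files;
        # the model's only job is confirm/deny with citations.
        file_block = "\n\n".join(f"=== {p} ===\n{src}" for p, src in files.items())
        return (
            "Confirm or deny each risk against the actual code, citing "
            "file:line for every claim. Do not reframe the risk space.\n\n"
            f"PLAN:\n{plan}\n\nRISKS:\n" + "\n".join(RISKS) +
            f"\n\nFILES:\n{file_block}"
        )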
That happens when you aggressively buy into the latest tech without thinking about whether you really need it.
Why do you need an AI agent for working on a routine task in your staging environment?
"Never send a machine to do a human's job."
When I first started using Claude, one of my first big projects was tightening up my backups and planning around recovery. An incident like this is more or less inevitable if you're opening up permissions wide enough for the agent to act without your explicit OK.
Put infra deletion locks on your prod DBs right now, irrespective of whether you use agents. This was a well established practice before agents because humans can also make mistakes (but obviously not as frequently as we're seeing with agents).
If you do use agents then you should be able to ban related CLI commands in your repo. I upsert locks in CI after TF apply, meaning unlocks only survive a single deployment and there's no forgetting to reapply them.
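For anyone on AWS, the "upsert the lock after apply" step can be as small as this, assuming boto3 and RDS (the instance identifier is illustrative):

    import boto3

    rds = boto3.client("rds")
    rds.modify_db_instance(
        DBInstanceIdentifier="prod-db",
        DeletionProtection=True,  # delete calls now fail until this is unset
        ApplyImmediately=True,
    )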
Part of the problem is also the API design of many IaaS/SaaS providers: it's often extremely hard to limit tokens to the right scope, if it's possible at all.
Most access tokens should not allow deleting backups. Or if they do, those backups should sit in a staging area for a few days by default. People rarely want to delete their backups at all. It might be even better not to offer the option of deleting backups at all, and to always keep them until the retention period expires.
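If backups land in S3 or something compatible, Object Lock in compliance mode gets you exactly that: no token, however privileged, can delete them until retention expires. A sketch assuming boto3 (the bucket name is illustrative, and the bucket must have been created with Object Lock enabled):

    import boto3

    s3 = boto3.client("s3")
    s3.put_object_lock_configuration(
        Bucket="prod-db-backups",
        ObjectLockConfiguration={
            "ObjectLockEnabled": "Enabled",
            # every new backup object is undeletable for 7 days
            "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 7}},
        },
    )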
Execution-layer security must be deterministic. That's why we are working on AgentSH (https://www.agentsh.org), which is model-, framework-, and harness-agnostic.
The real issue is no actual backups.
Yeah. I've seen this happen with humans doing it, too. It's just bad access management.
And anyone can do it with the wrong access granted at the wrong moment in time... even senior devs.
At least this one won't weigh on any person's conscience. The AI just shrugs it off.
Giving agents direct access to devops? Idk man, that's quite the bleeding edge. I mean how hard is it to retain the most important procedures as manual steps?
If we must have GasTown/City/Metropolis, then at least get an agent to examine and block potentially harmful commands your principal agent is about to run.
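Even before a second model gets involved, a deterministic pre-filter in front of the principal agent's shell tool catches the careless cases. A sketch (the patterns are illustrative, not exhaustive, and a regex list won't stop a determined prompt injector, only accidents):

    import re
    import subprocess

    DESTRUCTIVE = [
        r"\brm\s+-rf\b",
        r"\bdrop\s+(table|database)\b",
        r"curl\s+-X\s*DELETE",
    ]

    def run_for_agent(cmd: str) -> str:
        # refuse anything matching a destructive pattern before it executes
        if any(re.search(p, cmd, re.IGNORECASE) for p in DESTRUCTIVE):
            raise PermissionError(f"blocked destructive command: {cmd!r}")
        return subprocess.run(cmd, shell=True, capture_output=True,
                              text=True).stdout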
I'm actually surprised that, at the scale AI is being used, we haven't seen more of this, or worse.
The same thing can happen in development. Data exfiltration or local file removals are often downplayed; I wonder why nobody talks about the lethal trifecta anymore.
PocketOS's website says "Service Disruption: We're currently experiencing a major outage caused by an infrastructure incident at one of our service providers. We are actively working with their team on recovery. Next update by 10:00a pst."
This is wrong. It was not an infra incident at their service provider.
As Jer says in the article, their own tooling initiated the outage. And now they're threatening to sue? "We've contacted legal counsel. We are documenting everything."
It is absolutely incredible that Jer had this outage due to bad AI infra, wrote the writeup with AI, and posted on Twitter and here on his own account.
As somebody at PocketOS instructed their AI in the article, "NEVER **ing GUESS!" with regard to access keys that can touch your production services. And use 3-2-1 backups.
Good luck to the rental car agencies as they are scrambling to resume operations.
I never adopted Opus 4.6 because it was too prone to doing things on its own. Anthropic called it "a bias towards action". I think 4.5 and 4.7 are much better in this regard. I'm not saying they are immune to this kind of thing though.
I’m sorry to be harsh but this is 100% your fault, and attempting to shift the blame onto Cursor and Railway just doesn’t fly.
The onus is on you to make sure your system uses the APIs in a way that’s right for your business. You didn’t. You used a non-deterministic system to drive an API that has destructive potential. I appreciate that you didn’t expect it to do what it did but that’s just naivety.
You’re reaping what you sowed.
Best of luck with the recovery. I hope your business survives to learn this lesson.
This has to be fake right?
Using LLMs for production systems without a sandbox environment?
Having a bulk volume destroy endpoint without an ENV check?
Somehow blaming Cursor for any of this rather than either of the above?
Why does your agent have permission to delete the production database?
If you think your AI “confessed,” that’s your problem right there.
From the category of "never run complex dd while drinking beer"
It seems like the most unreasonable thing happening here is Railway's backup model and lack of scoped tokens. On the agent side of things, how would one prevent this, short of manually approving all terminal commands? I still do this, but most people who use agents would probably consider this arcane.
(Let's suppose the agent did need an API token to e.g. read data).
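One middle ground, short of approving everything: auto-approve commands matching a known-safe allowlist and escalate the rest to a human. A sketch (the safe patterns are illustrative):

    import re
    import subprocess

    SAFE = [r"^git (status|diff|log)\b", r"^ls\b", r"^cat\b", r"^grep\b"]

    def gated_run(cmd: str) -> str:
        if not any(re.match(p, cmd) for p in SAFE):
            # anything not provably read-only needs a human in the loop
            if input(f"approve? {cmd!r} [y/N] ").lower() != "y":
                raise PermissionError("denied by operator")
        return subprocess.run(cmd, shell=True, capture_output=True,
                              text=True).stdout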
And we're still relatively early...
Batten down the hatches, folks.
The author posted their own confession right here: https://pbs.twimg.com/profile_banners/591273520/1719711719/1...