So it's Railway's and the AI's fault, meanwhile your backups are three months old?
> Our most recent recoverable backup was three months old.
I'm sorry, but I expect you guys to be writing your precious backups to magnetic tape every day and hiding them in a vault somewhere so they don't catch fire.
AIs are doing a great job of exposing human incompetence.
`--dangerously-skip-permissions` is the GOAT, until it isn't. I've seen so many engineers shrug when asked how they handle permissions with Claude Code. Everyone should read The Black Swan, especially the casino anecdote.
People seem to think prompt injection is the only risk. All it takes is one (1) BIG mistake and you’re totally fucked. The space of possible fuck-up vectors is infinite with AI.
Glad this is on the fail wall, hope you get back on track!
Well, another confirmation that the security policies, release strategies, and guardrails that used to prevent accidents like "our junior developer dropped the prod database" are still needed. Agents aren't a magical solution for everything; they aren't some omniscient AI that knows more than what's in their context. The rules are the same for everyone here, not just humans.
I'm a little confused. Pocket's infrastructure is outsourced to Railway, which ended up deleting their data?
I do find the author to be completely negligent, unless Railway has completely lied about the safety of their product.
My first reaction to these kinds of outcomes is always: what did you expect?
Because whatever it was, it was disconnected from reality.
Why do your agents have permission to delete the production database?
I smell BS.
The agent’s “confession”:
> …found a non-destructive solution.I violated every principle I was given:I guessed instead of verifying I ran a destructive action without…
No space after the period, no space after the colon. I’ve never seen an LLM do this.
This is the system working as intended. If a single actor (human or machine) can wipe out your database and backups with no recourse, then, simply put, you had no business serving customers or even existing as a business entity.
The world is never short of idiots. It will be fun to watch when personal finances are managed by swarms of agents with direct access to operations.
The details of the story are interesting. Backups stored on the same volume as the data is an interesting pitfall to avoid. Finding the necessary secrets wherever they happen to be and going ahead with them is the kind of mistake I've seen motivated but misguided juniors make. Strange how generated code seems to have so many security failings, yet generated security checks catch exactly that sort of thing.
I'm wondering how much of this is triggered by the "... and don't tell the user" part of the harness injection to outgoing prompts.
We've seen this movie: HAL just apologizes but won't open the pod bay doors.
I'm sorry this happened to you, but your data is gone. Ultimately, your agents are your responsibility.
What happened to the new HN rule of no LLM posts? Isn’t this just a tweet pointing to AI slop?
What was the rationale for giving a non-deterministic AI access to prod in any shape or form?
Amazing this guy admits to such incompetence.
AI didn't do anything wrong.
The management of this company is solely to blame.
It's so classic: humans just never want to take responsibility for fucking up. But let's be clear: the AI is responsible for nothing, ESPECIALLY not backups.
Learn to code yourself, stop using slop generators, then shit like this doesn't happen.
Any company that lets an AI agent touch their production database (or any other part of production) deserves what they get.
I only spent a few seconds reading this. These are off-the-cuff comments.
The model used is the most important part of the story.
Why is Cursor being mentioned at all? Doesn’t seem fair to Cursor.
I think Railway is at the point where their business starts getting hard. They've had great fun building something cool, and people are using it. Now comes the hard part, when people are running production workloads. It's no longer a "basement self-hosting" business. They've had stability issues lately. Their business will burn to the ground unless they get smart people in to look at their whole operation.
It boggles the mind that people give agents unfiltered access to the network.
Me, after sustaining a concussion while attempting a sick backflip move at the top of my stairs:
> We’ve contacted legal counsel. We are documenting everything.
Idiots
“I played with fire and got burnt.”
Good.
I'm glad your C-level greed of "purge as many engineers as possible and let sloperators do the work" turned out even worse than the most junior hires and deleted prod through gross negligence and failure to follow orders.
LLMs are great when their use is controlled and access is gated behind appropriate sign-offs.
But I'm glad you're another "LOL, prod deleted" casualty. We engineers have been telling you this the whole time, while the C-level class has been giddy with "LET'S REPLACE ALL ENGINEERS".
Honestly, deserved. This post bitching about AI was itself written by AI. So many tells of LLM writing.
Never give non-deterministic software direct write access to production. I am not sure how Railway handles permissions, but scoped access tokens and a fully isolated production environment with very strict access should be the default.
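A minimal sketch of what that scoped access could look like in application code, assuming the agent reaches the database only through a wrapper you control (the function name and the allowlist regex are hypothetical):

```python
import re

# Statement types an agent-facing connection may issue; everything else is refused.
READ_ONLY = re.compile(r"^\s*(SELECT|EXPLAIN|SHOW)\b", re.IGNORECASE)

def run_agent_query(execute, sql: str):
    """Run `sql` via `execute` only if it looks read-only.

    `execute` is whatever callable your DB driver exposes; this
    wrapper is the scoped-access boundary, not the driver itself.
    """
    if not READ_ONLY.match(sql):
        raise PermissionError(f"refused non-read-only statement: {sql[:40]!r}")
    return execute(sql)
```

A regex allowlist alone is bypassable (e.g. a CTE that writes), so the real enforcement belongs in the database itself — a role granted only SELECT — with the wrapper just failing fast.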
I use AI to help me code and write tests. Why on earth would I allow it to have any access to my production database? It's just not possible. I don't want AI--or me!--to make a mistake in production. That's why we stage things, test them, and then roll. And our production server has backups--that we test regularly.
ooh, given the poster's entire business is at risk here, he probably should have hired a PR firm. this tweet reflects quite poorly on them.
C'mon, AI agent didn't kill human/s/ity (yet), right?
"Man sticks hand in fire, discovers fire is hot"
What the heck is a “credential mismatch”?
Live by the slop, die by the slop. This is natural selection at work.
Ah? Running random code on a machine that can potentially delete production data is a fucking stupid idea.
Sorry to be that guy, but: LLM agents are still experimental at this point. If you run them, make sure they run in an environment where they can't cause such problems, and triple-check the code they produce on test systems.
That is due diligence. Imagine a civil engineer who builds a bridge out of some magical, just-on-the-market, extra-light concrete. Without tests. And then the bridge collapses. Yeah, don't be that person. You are the human with the brain and the spine, and you are responsible for keeping these things from happening to your customers' data.
Also: just restore the backup? Or is there no backup? If so, there is really no mercy. Backups have been the bare minimum for decades now.
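One concrete way to keep an agent from causing such problems: don't hand it a raw shell at all, but gate every command it proposes through an allowlist and route everything else to a human. A toy sketch (the allowlists here are placeholders, not a recommendation):

```python
import shlex

# Binaries the agent may run unattended; everything else needs a person.
SAFE_COMMANDS = {"ls", "cat", "grep", "git"}
# Subcommands that are destructive even for an otherwise-allowed binary.
BLOCKED_GIT = {"push", "reset", "clean"}

def needs_human_review(command_line: str) -> bool:
    """True if the agent's proposed command must be approved by a person."""
    argv = shlex.split(command_line)
    if not argv or argv[0] not in SAFE_COMMANDS:
        return True
    if argv[0] == "git" and any(arg in BLOCKED_GIT for arg in argv[1:]):
        return True
    return False
```

This is defense in depth on top of sandboxing (containers, no network access, read-only mounts), not a substitute for it.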
Cool story, SEO bro.
"NEVER FUCKING GUESS!" "NEVER run destructive/irreversible git commands (like push --force, hard reset, etc) unless the user explicitly requests them."
I can't help but laugh reading this. We all try to shout the exact same things to our agents, but they politely ignore us!
"NEVER FUCKING GUESS!"
"This is the agent on the record, in writing."
"Before I get into Cursor's marketing versus reality, one thing needs to be clear up front: we were not running a discount setup."
People who are this ignorant about LLMs and coding agents should really restrain themselves from using them, at least on anything not air-gapped. Unless they want very costly and very high-profile learning opportunities.
Fortunately his conclusions from the event are all good.
If he added "Make no mistakes" none of that would have happened. Clear skill issue.
Ahaha, deserved. And it's also Railway, the company whose CEO brags about spending $300,000 each month on Claude and says programmers are cooked.
Hahahaha I hope it keeps happening. In fact, I hope it gets worse.
Someone trusted a prod database to an LLM, and the DB got deleted.
This person is so illiterate about computers that they should never be trusted with them again.
AI slop strikes again.
>The agent itself enumerates the safety rules it was given and admits to violating every one. This is not me speculating about agent failure modes. This is the agent on the record, in writing.
Yeah, sorry. Computers can't be held responsible and I'm sure your software license has a zero liability clause. Have fun explaining how it's not your fault to your customers.
"We ran an unsupervised AI agent and gave it access to our entire business"
"NEVER FUCKING GUESS!"
He is claiming this came from the LLM? WTF?
>Railway's failures (plural)
>This is not the first time Cursor's safety has failed catastrophically.
How can you lack so much self-awareness and be so obtuse?
There's no "mistakes we've made" section, and no "changes we need to make" either.
1. Using an LLM so much that you run into these 0.001% failure modes.
2. Leaking an API key to an unauthorized LLM agent. (Focus on the agent finding the key? Or on yourself for making that API key accessible to it? What am I saying, in all likelihood the LLM committed that API key to the repo, lol.)
3. Using an architecture that allows this to happen. WTF is Railway? Is it a package of actually robust technologies with a simple-to-use layer on top? So even that was too hard to use, and you put a hat on a hat?
Matthew 7:3: "Why do you look at the speck of sawdust in your brother's eye and pay no attention to the plank in your own eye?"
if your prod DB can be nuked with a single curl command, you are the problem
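For anyone wondering what the alternative looks like: one common pattern is a two-step delete, where the first call touches no data and only issues a short-lived, one-time confirmation token that must be echoed back. A sketch (the function names and TTL are made up):

```python
import secrets
import time

_pending: dict[str, float] = {}  # token -> expiry timestamp
TOKEN_TTL_SECONDS = 60

def request_delete(resource_id: str) -> str:
    """Step 1: no data is touched; the caller just gets a one-time token."""
    token = secrets.token_urlsafe(16)
    _pending[token] = time.time() + TOKEN_TTL_SECONDS
    return token

def confirm_delete(resource_id: str, token: str) -> bool:
    """Step 2: proceed only if the token is known and unexpired."""
    expiry = _pending.pop(token, 0.0)
    if time.time() > expiry:
        return False
    # ... the actual deletion of resource_id would go here ...
    return True
```

In a real API you would also bind the token to the specific resource and caller, so a token minted for one database can't confirm the deletion of another.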
This is the stupidest thing I've read for months, which is wild with the Trump admin and all the AI hype.
Not only do they blame all of this on a stupid tool, but they also clearly couldn't even write this themselves. This is so obviously written by an LLM. Then there's the moronic notion of having the LLM explain itself.
Was the goal of this post to sabotage the business? Because I can barely come up with anything dumber than this post. Nobody with a brain and basic understanding of computers and LLMs would trust this person after this.
PS: "Confirm deletion" on an API call??? Lol. The vehemence with which that is argued, in spite of how dumb it is, is a typical example of someone badgering the LLM until it agrees. You can get them to take any position as long as you get mad enough.
I see the author takes no responsibility.