justjake • yesterday at 4:58 PM • 3 replies

Hello! Railway founder here

We'll have a post mortem for this one as we always write post mortems for anything that affects users

Our initial investigation reveals this affects <3% of instances

Apologies from myself + the Team. Any amount of downtime is completely unacceptable

You may monitor this incident here: https://status.railway.com/cmli5y9xt056zsdts5ngslbmp

Replies

vintagedave • yesterday at 5:12 PM

Hi Jake. Appreciate your presence here on HN.

This affected a seemingly random set of services across three of my accounts (pro and hobby, depending on whether it's for work or just myself). That ranges from WordPress to static site hosting to a custom Python server. All of the deployments showed as Online, even after receiving a SIGTERM.

While 3% is 'good', that's an awfully wide range of things across multiple accounts for me, so it doesn't feel like 3% ;) Please publish the post mortem. I am a big fan of Railway but have really struggled with the number of issues recently. You don't want to get GitHub's growing rep. Some people are already requesting I move one key service away, since this is not the first issue.

Finally, can I make a request re communication:

> If you are experiencing issues with your deployment, please attempt a re-deploy.

Why can't Railway restart or redeploy any affected service? This _sounds_ like you're requiring 3% of your users to manually fix the issue. I don't know if that's a communication problem or the actual solution, but I certainly had to do it manually, server by server.

port3000 • yesterday at 6:02 PM

Second complete outage on Railway in 2 months for us (there was also a total outage on December 16th), and many issues with stuck builds and other minor issues in the months before that.

Looking to move. It's a bit of a hassle to set up Coolify and Hetzner, but I have lost all trust.

iJohnDoe • yesterday at 6:11 PM

Many questions on their forum are similar to our situation. People are wondering if they should restart their containers to get things working again. They're worried about whether they should do anything, whether they risk losing data if they do, or whether they should just give everything more time.
