Hacker News

wvh 05/15/2025

Two days ago, I'd have said the same. Yesterday, the big box went down, and because it had been so stable, failover was a joint that rarely got oiled: the spare chickened out at the wrong time and apparently even managed to mess up the database timeline. Today was the post-mortem, and it was rough.

I'm just saying, simple is nice and fast when it works, until it doesn't. I'm not saying to make everything complex, just to remember life is a survivor's game.


Replies

thehappyfellow 05/16/2025

You’re right, there are downsides like the one you mention! We mitigate it by running a hot backup we can switch to in seconds, plus a box on which we test restoring backups every 24h. That’s necessary! But it requires 3x the number of big expensive boxes.

I still think it’s the right tradeoff for us: operating a distributed system is also very expensive in terms of dev and ops time, costs are more unpredictable, etc.

It’s all tradeoffs, isn’t it?