Another great question is "When did you last try to restore from a backup?" which usually is answered with "It's the built-in tooling, why would we assume it's broken?" or similar. Then fast-forward some months/years, and they try to restore from backups only to realize the backups never actually backed up what they cared about.
This famously happened at GitLab: https://about.gitlab.com/blog/2017/02/01/gitlab-dot-com-data...
> Regular backups seem to also only be taken once per 24 hours, though team-member-1 has not yet been able to figure out where they are stored. According to team-member-2 these don’t appear to be working, producing files only a few bytes in size.
We've avoided that in various shops by making backups/restores part of regular maintenance processes. How do we upgrade the database? By stopping it, backing it up, restoring that to the new server, pointing all code at the new DB, then turning off the old server.
As with code deployment, it's not so scary when it's something you do so frequently that it's just a little script you run.
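The "files only a few bytes in size" failure quoted above is cheap to catch automatically. A minimal sketch, assuming backups land as plain files on disk; the thresholds and the function name `backup_problems` are made up for illustration and would need tuning to your own sizes and schedule:

```python
import os
import time

# Hypothetical thresholds; tune them to your own backup sizes and schedule.
MIN_BYTES = 1024              # anything smaller is almost certainly an empty dump
MAX_AGE_SECONDS = 26 * 3600   # a daily backup plus a little slack


def backup_problems(path, min_bytes=MIN_BYTES, max_age=MAX_AGE_SECONDS, now=None):
    """Return a list of human-readable problems; an empty list means the file passes."""
    now = time.time() if now is None else now
    if not os.path.isfile(path):
        return [f"{path}: missing"]
    info = os.stat(path)
    problems = []
    if info.st_size < min_bytes:
        problems.append(f"{path}: only {info.st_size} bytes")
    age = now - info.st_mtime
    if age > max_age:
        problems.append(f"{path}: last written {int(age // 3600)} hours ago")
    return problems
```

This only tells you a file exists, is not tiny, and is recent. It says nothing about whether the file actually restores, so it complements a real restore rather than replacing one.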
aws-cli will sync your s3 buckets to a local system.
I’m doing that to a Linux box, which is in turn backed up with Nakivo.
Not my favorite, but the price was okay and I can run the whole director on Linux, unlike its competitors. [Veeam’s next major release, 13 or 14, should do this too in the next year or so.]
While Nakivo backs up S3 buckets, NFS shares, and local file servers… to your point, I don’t trust it (or any other backup software whose output I can’t extract and unpack by hand) any further than I can throw it. So I rsync or mirror the data to a local Linux box with aws-cli and then back THAT up.
I think you can do all this with Windows stuff too, but I don’t know it that well.
Additionally, you can take servers that are Linux VPSes and do the reverse: mirror THEIR content to an S3 bucket.
You can also run MinIO (open source/free) on your fileserver and set up S3-to-S3 sync. Cloudflare, for example, will ingest and replicate your MinIO server automatically, and you can firewall it all off to their address ranges. It’s not free, but it actually prices out favorably compared to Veeam and Nakivo if that’s all you need backed up.
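The aws-cli mirroring described above, in both directions, can be wrapped in a few lines so it runs from cron and fails loudly. A rough sketch, assuming `aws` is installed and already has credentials; the helper names are made up for illustration:

```python
import subprocess


def s3_sync_command(source, destination):
    """Build an `aws s3 sync` command; either side may be an s3:// URI or a local path."""
    if not (source.startswith("s3://") or destination.startswith("s3://")):
        raise ValueError("at least one side must be an s3:// URI")
    return ["aws", "s3", "sync", source, destination]


def mirror(source, destination):
    """Run the sync and raise CalledProcessError if aws-cli exits non-zero."""
    subprocess.run(s3_sync_command(source, destination), check=True)
```

With hypothetical names, `mirror("s3://example-bucket", "/srv/mirror/example-bucket")` pulls a bucket down to the Linux box, and `mirror("/var/www", "s3://example-vps-backups/var/www")` pushes a VPS directory up to a bucket. The result on the local side is plain files, which is the point: you can inspect them by hand.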
A fun one like that: a few years back we had some code using DynamoDB that relied on the automatic point-in-time backups. I asked if restoring had ever been tested; do you need to guess the reply?
Of course it turns out that the restore can only happen to a _new database name_ not the original, and the code had in multiple places hardcoded the assumption of what the db was called.
So restoring also involved patching the code and rolling that out; you can't "roll back" because to roll back the db the code must roll forward.
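One way to avoid that trap is to resolve the table name at runtime, so pointing the code at a restored table is a config change rather than a patch. A sketch under assumptions: the table name `orders` and the environment variable `ORDERS_TABLE` are hypothetical, and `client` is expected to be a boto3 DynamoDB client (e.g. `boto3.client("dynamodb")`), passed in so the code can be exercised without AWS:

```python
import os

DEFAULT_TABLE = "orders"  # hypothetical table name


def table_name(env=None):
    """Resolve the table name at runtime instead of hardcoding it at call sites."""
    env = os.environ if env is None else env
    return env.get("ORDERS_TABLE", DEFAULT_TABLE)


def restore_latest(client, source, target):
    """Restore `source` into a new table `target` via point-in-time recovery."""
    if source == target:
        raise ValueError("point-in-time restore needs a new table name")
    return client.restore_table_to_point_in_time(
        SourceTableName=source,
        TargetTableName=target,
        UseLatestRestorableTime=True,
    )
```

After something like `restore_latest(client, "orders", "orders-restored")` completes, setting `ORDERS_TABLE=orders-restored` switches the code over without a deploy of patched source.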
Agreed: if you haven't tested your backups recently (daily and automatic is best), you don't have backups. Several of my clients (CTO Coaching) had problems in the past because they restored backups and found they were not complete (for various reasons).
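For what an automated restore test can look like, here is a small self-contained sketch using SQLite from the Python standard library as a stand-in for a real database. It restores the backup into a scratch file and compares per-table row counts; the function names are made up, and row counts are only one crude completeness check:

```python
import os
import sqlite3
import tempfile


def backup_database(source_path, dest_path):
    """Copy one SQLite database file to another using the online backup API."""
    src = sqlite3.connect(source_path)
    dst = sqlite3.connect(dest_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def row_counts(db_path):
    """Return {table_name: row_count} for every user table in the database."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
        return {t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
                for t in tables}
    finally:
        conn.close()


def restore_matches(live_path, backup_path):
    """Restore the backup into a scratch file and compare row counts with live."""
    with tempfile.TemporaryDirectory() as scratch:
        restored = os.path.join(scratch, "restored.db")
        backup_database(backup_path, restored)  # the "restore" step
        return row_counts(restored) == row_counts(live_path)
```

On a busy system the live database will have moved on since the backup ran, so in practice you would more likely compare against counts recorded at backup time, or check that the restored counts are non-zero and within an expected range.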
In addition to making sure it works, you should make sure you know how to deal with restoring. Sure, running a command is easy, but what about spinning up new infra? What if the backup is corrupted? What if the one person who knows the setup is gone or asleep? This is mainly a problem for smaller teams that don’t have the redundancy or resources; they really need to make sure there are at least docs on how stuff was set up. Reminds me I need to do my yearly checkup too.
You have to test restoration as part of SOC 2, so most companies with real customers do it at least once a year.
One thing I've never figured out is, what is the difference between backups and replication? And, does restoring from backups always mean losing more _recent_ data than replication?
"Amateurs backup. Professionals restore."
My dad told me about a customer that had a server that made automatic backups each Sunday night. The backup script would back up all the data, then eject the tape so the manager could put it in the vault and rotate in the other one from the vault.
When the hard drive failed, they restored the customer to the latest backup, which was the tape still sitting in the tape drive in the server. It was from the first Sunday night after the system was installed, years ago.