
eddythompson80, yesterday at 5:31 PM

> If you're building an app for yourself to track your own food habits; why does DB, framework, best practices matters?

They don't. It's just annoying as shit when things break at the worst time for lack of these "best practices" and you know that the only answer will be "do better". I'll give you an example. Years ago I migrated a lot of my app usage to self-hosted OSS apps for all the usual reasons. I did like 80% of what I perceived as the "important best practices": set up ZFS with redundancy to handle drive failures, a UPS for power interruptions, WireGuard for secure access, Docker for application and dependency isolation, etc.

But there were always things where I just thought, "I should probably do that, but later. This is just for me."

It would be the end of the day, I'm tired and in bed wanting to just chill and watch something on my iPad, and what do you know, my Plex is down, again.

Why does it go down every few days? Now I need to go get a laptop, ssh into my server, run docker logs. See a bunch of exceptions. I don't want to debug it today. Just restart it, ok, it works again. Go to bed, start watching.

20 minutes in.. I think it's down again.. wtf? Get the laptop again, google the error, something about a sqlite db on an NFS share not being very stable. All my ZFS storage is only exposed as NFS and SMB shares to another machine.. Ok, just restart and hope it works and I'll deal with it later.

Forget for a couple of days. I'm with a friend at her place and want to watch again, and fuck me, I never fixed the sqlite issue. Never mind, let's just watch Netflix.

Over the weekend, I'm determined to get this fixed. Move the application folder off NFS and onto the local machine's SSD. It doesn't have redundancy, but it's ok for now. I'll set up an rsync job to copy it to the NFS share in case the SSD fails. I just want to see if it'll be stable.

A few months pass, and it's been pretty stable until I have a power outage. The UPS was there, but the configuration to notify the OS to shut down broke a while ago and I didn't notice. Files on ZFS are fine, but some on the local SSD got corrupted and I didn't notice, including the Plex database. The rsync job just copied the corrupted file over the "backup" file.
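To make that failure mode concrete, here is a minimal sketch of the kind of backup job that would have avoided it: check the database before copying it, and write timestamped snapshots instead of overwriting a single "backup" file. The function names, the `keep` count, and the snapshot naming are all made up for illustration, and it assumes the app's database is a plain SQLite file that can be opened read-only (a database in WAL mode or one under heavy writes may need more care). It is not how Plex itself handles backups.

```python
import sqlite3
import time
from pathlib import Path
from typing import Optional


def sqlite_is_healthy(db_path: Path) -> bool:
    """Return True if SQLite's own integrity check reports 'ok'."""
    try:
        # Open read-only so the check itself never modifies the file.
        conn = sqlite3.connect(db_path.resolve().as_uri() + "?mode=ro", uri=True)
        try:
            row = conn.execute("PRAGMA integrity_check").fetchone()
        finally:
            conn.close()
        return row is not None and row[0] == "ok"
    except sqlite3.DatabaseError:
        # A badly damaged file may fail to open or parse at all.
        return False


def backup_if_healthy(db_path: Path, backup_dir: Path, keep: int = 7) -> Optional[Path]:
    """Snapshot db_path into backup_dir only if it passes the integrity check.

    Returns the new snapshot path, or None if the source looked corrupted.
    Older snapshots beyond `keep` (assumed >= 1) are pruned.
    """
    if not sqlite_is_healthy(db_path):
        return None  # Never let a corrupted file replace a good backup.

    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%dT%H%M%S")
    target = backup_dir / f"{db_path.stem}-{stamp}.db"

    # Use SQLite's online backup API rather than a raw file copy.
    src = sqlite3.connect(db_path.resolve().as_uri() + "?mode=ro", uri=True)
    dst = sqlite3.connect(str(target))
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()

    # Timestamped names sort chronologically, so the oldest come first.
    snapshots = sorted(backup_dir.glob(f"{db_path.stem}-*.db"))
    for old in snapshots[:-keep]:
        old.unlink()
    return target
```

With something like this, a corrupted source makes the job return `None` (which is a good thing to alert on) while the previous snapshots stay untouched.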

It's late at night again, I just want to relax and watch something, and I discover this happened. I could try to figure out how to recover it, but it's probably easier to just do a clean scan. It's gonna take hours. Let's just start it and go to sleep.

Later, let's just migrate everything to Jellyfin. Have auto-upgrade set up because I'm smart. Jellyfin 10.8 updates and unfavorites all the favorited music tracks. "You have backups, right?" "Well, yes I do. Let me make sure I have an evening cleared so I can set up another instance of Jellyfin, restore the old backups, export the favorites list, and import it into the new one"... oh, there is no way to do that? I guess I can export it to CSV and get a plugin to automate it for me. The plugin hasn't been updated to 10.8, but there is a pull request. Ok, let's wait. Forget that I set up restic to delete backups older than 30 days. Fuck me. I have the CSV somewhere, I think. God, my `/tmp` is ephemeral and I hope I haven't rebooted since then. Phew, it's there. Fuck me still.

I have worked in managing services for most of my career. I know what I'm doing wrong. I need to set up monitoring, alerts, health checks, 3-2-1 backups (not just rsync to a ZFS pool) using actual backup software that tracks file versions, off-site redundancy, dashboards for anomaly detection, scheduled hardware upgrades, and checks for memtest, disk health, and UPS configuration. I know how 3 or 4 nines are achieved in the industry.
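For a sense of scale, the "health checks" item on that list can start as something as small as the sketch below: poll each service over HTTP and report which ones are down, so you find out before you are in bed with the iPad. The service names and URLs here are placeholders I made up, not real endpoints from my setup, and the alerting part (email, push notification, whatever) is left out entirely.

```python
import urllib.error
import urllib.request
from typing import Dict, List


def is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


def find_down_services(services: Dict[str, str], timeout: float = 5.0) -> List[str]:
    """Return the names of all services whose health URL did not respond."""
    return [name for name, url in services.items() if not is_up(url, timeout)]


if __name__ == "__main__":
    # Hypothetical service list; replace with whatever you actually run.
    services = {
        "media-server": "http://localhost:8096/health",
    }
    for name in find_down_services(services):
        print(f"DOWN: {name}")
```

Run from cron or a systemd timer every few minutes, this covers the "is it even up" case. It says nothing about disk health, UPS configuration, or backup freshness, which each need their own check.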