scottlamb, yesterday at 8:56 PM

I agree there's a scale below which neither this nor any other optimization matters, and a scale above which you want your primary key to have locality (in terms of which shard/tablet/... is responsible for the record). But...

* I think there is a wide range in the middle where your database can fit on one machine if you do it well, but it's worth optimizing to use a cheaper machine and/or extend the time until you need to switch to a distributed db. You might hit this middle range soon enough (and/or it might be a painful enough transition) that it's worth thinking about it ahead of time.

* If/when you do switch to a distributed database, you don't always need to rekey everything:

** You can spread existing keys across shards via hashing on lookup or reversing bits. Some databases (e.g. DynamoDB) actually force this.

** Allocating new ids in the old way could be a big problem, but there are ways out. You might be able to switch allocation schemes entirely without clients noticing if your external keys are sufficiently opaque. If you went with UUIDv7 (which addresses some but not all of the article's points), you can just keep using it. If you want to keep using dense(-ish), (mostly-)sequential bigints, you can amortize the latency by reserving blocks at a time.
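To make the last two sub-points concrete, here's a rough Python sketch, not a definitive implementation. All names in it (`reverse_bits64`, `shard_by_hash`, `shard_by_reversed_bits`, `BlockIdAllocator`, `InMemoryCounter`) are made up for illustration, and `InMemoryCounter` is only a stand-in for whatever central sequence your real database provides. It assumes keys are unsigned 64-bit integers.

```python
import hashlib
import threading


def reverse_bits64(key: int) -> int:
    """Reverse the bits of an unsigned 64-bit key so neighbouring ids land far apart."""
    return int(f"{key:064b}"[::-1], 2)


def shard_by_hash(key: int, num_shards: int) -> int:
    """Pick a shard from a hash of the key, computed at lookup time (the stored key is unchanged)."""
    digest = hashlib.sha256(key.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_by_reversed_bits(key: int, num_shards: int) -> int:
    """Range-partition the bit-reversed key space into num_shards equal slices."""
    return (reverse_bits64(key) * num_shards) >> 64


class InMemoryCounter:
    """Hypothetical stand-in for a central sequence; each reserve() models one slow round trip."""

    def __init__(self, start: int = 1):
        self.value = start
        self.calls = 0

    def reserve(self, n: int) -> int:
        first = self.value
        self.value += n
        self.calls += 1
        return first


class BlockIdAllocator:
    """Hands out ids locally from a reserved block, refilling only when the block runs out."""

    def __init__(self, reserve_block, block_size: int = 1000):
        self._reserve_block = reserve_block  # callable(n) -> first id of a fresh block of n ids
        self._block_size = block_size
        self._next = 0
        self._end = 0  # exclusive end of the current block; 0 means nothing reserved yet
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            if self._next >= self._end:
                # One expensive call covers the next block_size allocations.
                self._next = self._reserve_block(self._block_size)
                self._end = self._next + self._block_size
            allocated = self._next
            self._next += 1
            return allocated
```

With a block size of 100, handing out 250 ids costs three calls to the central counter rather than 250. The ids are only dense-ish and mostly sequential: with several allocators running, blocks interleave, and whatever is left of a block is lost if the process restarts.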