Hacker News

evanelias, last Thursday at 2:18 PM

It was absolutely valid advice for its time, but only in the highly specific cases/reasons outlined in the book. The 2nd edition was written by the top Percona folks, who pretty much had more experience scaling databases for large websites than anyone else.

The Prisma answer just does not summarize correctly what the book was saying.

It mainly boiled down to sharding and external caching. Storage and memory were much smaller back then, so there was a lot of sharding and functional partitioning, and major reliance on memcached; all of those are easier if you minimize excessive JOINs.

The query planner in MySQL wasn't great at the time either, and although index hints could help, huge complex queries sometimes performed worse than multiple decomposed simpler queries. But the bigger issue was definitely enabling sharding (cross-shard joins had to be handled at the application level) and enabling external caching (do a simple range scan DB query to get a list of IDs/PKs, then do point lookups in memcached, then finally do point lookups in the DB for any that weren't in memcached).
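
To make the caching pattern concrete, here is a rough sketch of that three-step read path. It is only an illustration, not code from the book: the `posts` table, the `fetch_posts` helper, and the column names are made up, a plain dict stands in for memcached, and SQLite stands in for MySQL.

```python
import sqlite3

def fetch_posts(db, cache, user_id):
    """Cache-aside read: IDs from the DB, rows from the cache, misses from the DB."""
    # 1. Simple range scan that returns only primary keys.
    ids = [r[0] for r in db.execute(
        "SELECT id FROM posts WHERE user_id = ? ORDER BY id", (user_id,))]
    # 2. Point lookups in the cache (a real memcached client would use a multi-get).
    found = {i: cache[i] for i in ids if i in cache}
    # 3. Point lookups in the DB for the misses only, then backfill the cache.
    missing = [i for i in ids if i not in found]
    if missing:
        marks = ",".join("?" * len(missing))
        for pk, body in db.execute(
                f"SELECT id, body FROM posts WHERE id IN ({marks})", missing):
            found[pk] = cache[pk] = body
    return [found[i] for i in ids]

# Hypothetical data, purely for demonstration.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)")
db.executemany("INSERT INTO posts VALUES (?, ?, ?)",
               [(1, 7, "a"), (2, 7, "b"), (3, 8, "c"), (4, 7, "d")])
cache = {2: "b"}  # dict standing in for memcached; one row is already warm

posts = fetch_posts(db, cache, 7)
```

Note that no query here needs a JOIN: each step is either a narrow index scan or a primary-key lookup, which is what makes it easy to put a cache in front of the rows and to route each lookup to whichever shard owns the key.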