A combination of partitioning and sharding, perhaps? Often it's only a handful of tables that grow large, and even fewer for a single large customer, so sharding that customer out and then partitioning the data along a common/natural boundary should get you 90% of the way there. Most data can be partitioned, and it doesn't have to be by date. It pays dividends to go sit with the data and reflect on what is being stored, its read/write pattern, and its overall shape, to determine where to slice the partitions best.

Sometimes splitting a wide table into two or three smaller tables can work if your joins aren't too frequent or complex. It can also help to figure out which rows are hot and which are cold, so you can move the colder data into separate tables to speed up reads and writes on the hot path. There are always opportunities for storage optimization in large datasets, but it does take time and careful attention to get it right.
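To make the "shard the big customer out, hash the rest" idea concrete, here's a minimal Python sketch of a routing layer. Everything here is hypothetical (the shard names, the `DEDICATED_SHARDS` override table, and the `shard_for` helper are made up for illustration); real systems usually push this into a proxy or the ORM layer.

```python
import hashlib

# Shared pool for the long tail of small customers (names are made up).
SHARED_SHARDS = ["shard_0", "shard_1", "shard_2"]

# Oversized customers that have been sharded out onto dedicated databases.
DEDICATED_SHARDS = {"customer_big": "shard_big"}

def shard_for(customer_id: str) -> str:
    """Return the shard that holds this customer's rows."""
    if customer_id in DEDICATED_SHARDS:
        return DEDICATED_SHARDS[customer_id]
    # Stable hash so the same customer always lands on the same shard;
    # sha256 avoids Python's per-process hash randomization.
    digest = hashlib.sha256(customer_id.encode()).hexdigest()
    return SHARED_SHARDS[int(digest, 16) % len(SHARED_SHARDS)]
```

Within each shard you'd then partition the big tables by whatever natural boundary the data suggests (tenant, region, status, not necessarily date).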