Cool! I'd love to know a bit more about the replication setup. I'm guessing they are doing async replication.
> We added nearly 50 read replicas, while keeping replication lag near zero
I wonder what those replication lag numbers are exactly and how they deal with stragglers. It seems likely that at any given moment at least one of the 50 read replicas is lagging because of a CPU/memory usage spike. Then presumably that would slow down the primary since it has to wait for the TCP acks before sending more of the WAL.
> would slow down the primary since it has to wait for the TCP acks
Other than having to keep more WAL segments around, I'm not sure why it would slow down the primary?
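To make that concrete, here is a toy model of async replication, not how any particular database implements it. It assumes the primary never waits on replicas when committing and that it retains WAL until the slowest replica has consumed it (in a real system that depends on configuration). The names `Primary`, `Replica`, `apply_rate`, and `simulate` are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    apply_rate: int       # WAL records applied per tick (hypothetical number)
    applied_lsn: int = 0  # log position this replica has reached


@dataclass
class Primary:
    lsn: int = 0  # log position of the last record written
    replicas: list = field(default_factory=list)

    def commit(self, n_records: int) -> None:
        # Async replication: return without waiting for any replica.
        self.lsn += n_records

    def lag(self) -> dict:
        # Per-replica lag, measured in log records behind the primary.
        return {r.name: self.lsn - r.applied_lsn for r in self.replicas}

    def retained_wal(self) -> int:
        # Assumption: WAL is kept until the slowest replica has consumed it.
        slowest = min((r.applied_lsn for r in self.replicas), default=self.lsn)
        return self.lsn - slowest


def simulate(ticks: int, writes_per_tick: int, rates: dict) -> Primary:
    primary = Primary(replicas=[Replica(n, rate) for n, rate in rates.items()])
    for _ in range(ticks):
        primary.commit(writes_per_tick)
        for r in primary.replicas:
            # Each replica catches up at its own pace, never past the primary.
            r.applied_lsn = min(primary.lsn, r.applied_lsn + r.apply_rate)
    return primary


primary = simulate(ticks=10, writes_per_tick=100,
                   rates={"fast": 100, "straggler": 60})
print(primary.lsn)             # 1000: the primary wrote at full speed
print(primary.lag())           # {'fast': 0, 'straggler': 400}
print(primary.retained_wal())  # 400: the cost shows up as retained WAL
```

In this model the straggler falls 400 records behind, but the primary's write rate is unchanged. The only effect on the primary is the extra WAL it has to hold on to. With synchronous replication the picture would be different, since commits would wait on replica acknowledgement.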