this whole result leans on one assumption he mentions and then sets aside: independent, stateless requests. that's the load-bearing part. as soon as the c units share mutable state or have to coordinate, the M/M/c model stops applying and the pooling benefit goes with it. you trade it for coordination cost that grows with the number of pairs of workers, c(c-1)/2, rather than with c itself. I hit this constantly with multi-agent LLM systems. people add agents expecting load-balancer-style scaling and land in the opposite regime, because the work isn't independent: the agents are writing to shared state. so "pooling is cheap" is really "pooling independent work is cheap," and the independence is where all the benefit comes from.
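
to put rough numbers on the two regimes, here's a minimal sketch. it assumes the textbook Erlang C formula for M/M/c waiting time. the function names, the 80% per-worker utilization, and the service rate of 1.0 are my own illustrative choices, not anything from the article. the pair count is only a proxy for how many coordination channels exist, not a model of what coordination actually costs.

```python
from math import factorial


def erlang_c(c: int, offered_load: float) -> float:
    """Probability that an arriving request must wait in an M/M/c queue.

    offered_load is lambda / mu (in Erlangs) and must be below c,
    otherwise the queue is unstable.
    """
    if offered_load >= c:
        raise ValueError("unstable: offered load must be below c")
    rho = offered_load / c  # per-server utilization
    top = offered_load**c / factorial(c) / (1 - rho)
    bottom = sum(offered_load**k / factorial(k) for k in range(c)) + top
    return top / bottom


def mean_wait(c: int, lam: float, mu: float) -> float:
    """Mean time spent queueing (excluding service) in M/M/c."""
    return erlang_c(c, lam / mu) / (c * mu - lam)


def coordination_pairs(c: int) -> int:
    """Number of distinct worker pairs, a proxy for coordination channels."""
    return c * (c - 1) // 2


if __name__ == "__main__":
    mu = 1.0           # assumed service rate per worker
    utilization = 0.8  # assumed per-worker utilization, held fixed as c grows
    for c in (1, 2, 4, 8, 16):
        lam = utilization * c * mu
        print(c, round(mean_wait(c, lam, mu), 3), coordination_pairs(c))
```

with utilization held fixed, the queueing wait shrinks as c grows, which is the pooling benefit, and it only holds because the model treats requests as independent. the pair count grows quadratically over the same range, which is the regime you end up in once the workers have to agree on shared state.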