The author speculates about ways to deal with an overloaded queue.
Kingman's formula says that as you approach 100% utilization, waiting times explode.
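For anyone who wants to see the blow-up in numbers, here is a rough sketch of Kingman's approximation for a single-server (G/G/1) queue: mean wait ≈ (ρ / (1 − ρ)) × ((c_a² + c_s²) / 2) × mean service time. The function name and the default coefficients of variation are my own choices for illustration, not anything from the article.

```python
def kingman_wait(utilization, mean_service_time, cv_arrival=1.0, cv_service=1.0):
    """Approximate mean time spent waiting in a G/G/1 queue (Kingman's formula).

    utilization       -- rho, fraction of time the server is busy (0 <= rho < 1)
    mean_service_time -- average time to serve one item
    cv_arrival        -- coefficient of variation of inter-arrival times (assumed 1.0)
    cv_service        -- coefficient of variation of service times (assumed 1.0)
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    load_term = utilization / (1 - utilization)
    variability_term = (cv_arrival ** 2 + cv_service ** 2) / 2
    return load_term * variability_term * mean_service_time


if __name__ == "__main__":
    # Wait time, in multiples of the mean service time, at rising utilization.
    for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        print(f"rho={rho:.2f}  wait={kingman_wait(rho, 1.0):.1f}")
```

The ρ / (1 − ρ) term is what does the damage: going from 50% to 90% utilization multiplies the wait by roughly nine, and 99% is roughly another factor of eleven on top of that.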
The correct way to deal with this is bounded queue lengths and back pressure. I.e., don’t deal with an overloaded queue; don’t allow one in the first place.
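A minimal sketch of what that can look like, using Python's standard `queue.Queue` with a `maxsize`. The `BoundedInbox` class and its `offer`/`take` methods are hypothetical names made up for this example; the point is only that a full queue refuses new work and tells the producer so, instead of growing without limit.

```python
import queue


class BoundedInbox:
    """Fixed-capacity queue that rejects new items when full (hypothetical helper)."""

    def __init__(self, capacity):
        self._q = queue.Queue(maxsize=capacity)
        self.rejected = 0  # count of items refused because the queue was full

    def offer(self, item):
        """Try to enqueue without blocking; return False if the queue is full."""
        try:
            self._q.put_nowait(item)
            return True
        except queue.Full:
            # Back pressure: the caller learns immediately that it must
            # slow down, retry later, or drop the item.
            self.rejected += 1
            return False

    def take(self):
        """Dequeue the oldest item; raises queue.Empty if there is none."""
        return self._q.get_nowait()


if __name__ == "__main__":
    inbox = BoundedInbox(capacity=2)
    for job in ("a", "b", "c"):
        print(job, "accepted" if inbox.offer(job) else "rejected")
```

If producers can afford to wait, a blocking `put` with a `timeout` would push back on them by stalling them instead of rejecting outright; which of the two is right depends on what the callers can tolerate.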
Which is easy to say. I've been trying to debug an overloaded queue for over a week now. It used to work, until I discovered some serious race conditions that caused one-in-a-million crashes, and none of my fixes for them so far has fixed things. At least I can detect it and I'm allowed to toss things from the queue. But the fact is we were handling this before I put the fixes in, and people don't like it when I now reject things from the queue, so they want the performance back without the races.