Right. If this were truly a pure scaling issue, I’d expect the interface would offer an archive.is-esque “Claude is at capacity; your prompt is #XXX/YYY in the queue; estimated time remaining: ZZZ seconds”
Instead, the whole system just shits the bed, catastrophically.
Except these models are not run prompt-to-prompt. The infra has to hold the entire context.
But such messages would suggest that Claude has engineered limits, which isn't what the market wants to hear. Completely falling over and being unavailable is just another Tuesday on the internet, will be forgotten by the weekend.