I wonder how they fix things when Claude is down.
Based on this outage: not very well.
This is (or will be in the future) a surprisingly relevant issue.
Maybe they ask a secondary agentic system to fix it. Will that be the future of “redundancy”?
"Gemini, fix my Claude infra"
lol probably use their dev or qa Claude environment to fix prod
I would bet that they have inference set up for internal use on a system separate from the customer-facing production environment. It's the same reason telemetry infrastructure needs to run separately from normal production systems: so you aren't "blind" when you need it most.