Hacker News

nevon, today at 9:52 AM

How does it deal with partial failures like the upstream being unreachable from one datacenter but not the other, or from one region but not another? Or when the upstream uses anycast or some other way to route to different origins depending on where the caller is?

Making your circuit breaker state global seems like it would just exacerbate the problem. Failures are often partial in the real world.


Replies

rodrigorcs, today at 11:26 AM

Great question. Openfuse has a "systems" concept for exactly this. Each system is an isolated unit within an environment with its own breaker state. So you'd have us-east/stripe and eu-west/stripe as separate breakers. If Stripe is unreachable from us-east but healthy from eu-west, only the us-east breaker trips. The state is coordinated across all instances within a system, not globally across everything. You scope it to match your actual failure domains.
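
To make the scoping idea concrete, here is a minimal in-memory sketch of per-system breaker state. It is not Openfuse's API: `Breaker`, `BreakerRegistry`, `for_system`, and the threshold of 3 consecutive failures are all hypothetical names and values chosen for illustration, and the sketch leaves out cross-instance coordination, timeouts, and half-open probing.

```python
from dataclasses import dataclass, field


@dataclass
class Breaker:
    """Hypothetical consecutive-failure breaker (no half-open or timeout logic)."""
    failure_threshold: int = 3  # assumed value, for illustration only
    failures: int = 0
    is_open: bool = False

    def record_failure(self) -> None:
        # Trip once the consecutive failure count reaches the threshold.
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.is_open = True

    def record_success(self) -> None:
        # Simplification: any success resets the count and closes the breaker.
        self.failures = 0
        self.is_open = False


@dataclass
class BreakerRegistry:
    """Holds one breaker per system key, e.g. "us-east/stripe"."""
    breakers: dict[str, Breaker] = field(default_factory=dict)

    def for_system(self, system: str) -> Breaker:
        # Each system (failure domain) gets its own independent state.
        if system not in self.breakers:
            self.breakers[system] = Breaker()
        return self.breakers[system]


registry = BreakerRegistry()

# Simulate Stripe being unreachable from us-east but healthy from eu-west.
for _ in range(3):
    registry.for_system("us-east/stripe").record_failure()
registry.for_system("eu-west/stripe").record_success()

print(registry.for_system("us-east/stripe").is_open)  # True
print(registry.for_system("eu-west/stripe").is_open)  # False
```

The point of keying on the system name is that a regional outage only ever touches the breaker for that region, so eu-west callers keep sending traffic while us-east callers fail fast.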