Hacker News

verdverm · yesterday at 7:56 PM

It looks like the aggregate stats are more of a Venn diagram than an average: if any one of the N services is down, the aggregate is considered down. I don't think this is an accurate way to calculate it. It should be weighted, or should in some way show partial outages. This view is derived from the Google SRE book, in particular chapters 3 (Embracing Risk) and 4 (Service Level Objectives):

https://sre.google/sre-book/embracing-risk/

https://sre.google/sre-book/service-level-objectives/
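The two aggregation styles being compared can be sketched as follows. This is a minimal illustration, not GitHub's actual method; the service names, sample data, and traffic weights are all made up:

```python
# Hypothetical per-minute status samples for three services (True = up).
samples = [
    {"git": True, "actions": True,  "pages": True},   # everything up
    {"git": True, "actions": False, "pages": True},   # partial outage
    {"git": True, "actions": True,  "pages": True},
    {"git": True, "actions": True,  "pages": False},  # partial outage
]

# "Venn"/intersection style: the aggregate is up only when EVERY service is up.
strict_uptime = sum(all(s.values()) for s in samples) / len(samples)

# Weighted style: each service weighted by an (assumed) share of user traffic,
# so a partial outage only costs its weight, not the whole interval.
weights = {"git": 0.6, "actions": 0.3, "pages": 0.1}
weighted_uptime = sum(
    sum(weights[name] * up for name, up in s.items()) for s in samples
) / len(samples)

print(strict_uptime)    # 0.5 -- half the samples had some outage
print(weighted_uptime)  # 0.9 -- partial outages count partially
```

With the same underlying data, the strict aggregate reports 50% uptime while the weighted one reports 90%, which is the gap the comment is pointing at.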


Replies

ablob · yesterday at 8:33 PM

If you're using all services, then any partial outage is essentially a full outage. Of course, you can massage the numbers to make it look nicer in the way you described but the conservative approach is better for the customers. If you insist, one could create this metric for selected services only to "better reflect users".

That being said, even when looking at the split uptimes, you'd have to do a very skewed weighting to achieve a number with more than one 9.
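The "very skewed weighting" claim can be made concrete with a back-of-the-envelope model. Assuming (hypothetically) that some service is in partial outage 10% of the time, and that during each such outage only the weighted slice w is down, the arithmetic bounds how much weight the failing services may carry before the headline number drops below two nines:

```python
# weighted_uptime = 1 - outage_fraction * w  (under the assumed model)
# Solve 1 - outage_fraction * w >= target for w.
outage_fraction = 0.10  # strict uptime of 0.90, as discussed above
target = 0.99           # "two nines"
max_weight = (1 - target) / outage_fraction
print(round(max_weight, 3))  # 0.1
```

So to report 99% while being in some partial outage 10% of the time, the services that fail would have to be weighted at 10% or less of total traffic combined, i.e. a heavily skewed weighting.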

marcosdumay · yesterday at 8:34 PM

That's how you count uptime. Your system is not up if it keeps failing when the user does something.

The problem here is the specification of what the system is. It's a bit unfair to call GH a single service, but that's how Microsoft sells it.

bandrami · today at 1:51 AM

Thinking back to when I was hosting, I think telling a customer "your web server was running fine; it's just that the database was down" would not have been received well.

mort96 · yesterday at 8:27 PM

I mean I think it's useful. It answers the question, "what percentage of the time can I rely on every part of GitHub to work correctly?". The answer seems to be roughly 90% of the time.
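A rough 90% for "every part works" is plausible even when each individual service looks healthy. Under the (purely illustrative) assumption of 30 independent services, each at 99.65% uptime, the chance that all of them are up at once is the product of the individual availabilities:

```python
# Hypothetical numbers: 30 independent services, each at 99.65% uptime.
per_service = 0.9965
n_services = 30
all_up = per_service ** n_services  # P(all up) = product of availabilities
print(round(all_up, 2))  # 0.9
```

Each service is "three nines minus a bit", yet the all-parts-working figure lands near 90%, which matches the intuition in the comment.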

formerly_proven · yesterday at 8:47 PM

In a nutshell: why would the consumer (for the SLO) care about how the vendor sliced the solution into microservices?
