
raldi yesterday at 4:48 PM

You have an alert on what users actually care about, like the overall success rate. When it goes off, you check the WARNING log and metric dashboard and see that requests are timing out.
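For illustration only, a minimal sketch of what an alert on the overall success rate might look like. The class name, window size, and threshold here are all hypothetical and not taken from any particular monitoring tool; a real setup would more likely express this as a rule in whatever metrics system is already in place.

```python
from collections import deque


class SuccessRateAlert:
    """Hypothetical alert: fires when the success rate over the
    most recent `window` requests drops below `threshold`."""

    def __init__(self, window=100, threshold=0.99):
        # Keep only the most recent `window` outcomes (True = success).
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, ok):
        # Record the outcome of one request.
        self.outcomes.append(bool(ok))

    def success_rate(self):
        # With no traffic yet, treat the service as healthy.
        if not self.outcomes:
            return 1.0
        return sum(self.outcomes) / len(self.outcomes)

    def firing(self):
        # The alert fires on the user-facing symptom, not on a specific cause.
        return self.success_rate() < self.threshold
```

The point of the design is that the alert knows nothing about timeouts specifically; the cause is something you look up in the logs and dashboards after it fires.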


Replies

ImPostingOnHN yesterday at 5:07 PM

That is a lagging indicator. By the time you're alerted, you've already failed by letting users experience an issue.
