You have an alert on what users actually care about, like the overall success rate. When it goes off, you check the WARNING log and metric dashboard and see that requests are timing out.
That is a lagging indicator. By the time you're alerted, you've already failed by letting users experience an issue.
That is a lagging indicator. By the time you're alerted, you've already failed by letting users experience an issue.