We don't have enough data to confirm whether it's over- or under-reporting. A sample size of 1 is enough to prove the data is not perfectly accurate, but it's not enough to prove a bias in either direction.
Oh please, show me a company that has ever over-reported their downtime. That's silly.
That's fair. We don't know.
I am making an assumption that if Microsoft saw a lot of false-positive outages they would fix that, but might drag their feet if there was an outage that didn't get properly recorded (assuming it's automatic to begin with; it might be that a human needs to remember to update it).