A fairly large category of the flaky CI jobs I see is "dodgy infrastructure". For instance, one recurring type for our project is one I just saw fail this afternoon, where a GitLab CI runner tries to clone the git repo from GitLab itself and gets an HTTP 502 error. We've also had issues with "the s390 VM that runs the CI jobs is on an overloaded host, so mostly it's fine but occasionally the VM gets starved of CPU and some of the tests time out".
We do also have some genuinely flaky tests, but it's pretty tempting to hit the big "just retry" button when there's all this flakiness we can't control mixed in there.