git bisect works great for tracking down regressions, but relies on the bug presenting deterministically. But what if the bug is non-deterministic? Or worse, your behaviour was always non-deterministic, but something has changed, e.g. your tests went from somewhat flaky to very flaky.
In addition to the repo linked in the title, I also wrote up a little bit of the math behind it here: https://hauntsaninja.github.io/git_bayesect.html
Nice! I implemented a similar thing a while back: https://github.com/Ealdwulf/BBChop
I'm going to have to check out how you got linear time with Shannon entropy, because I used Renyi entropy to do that, to make the algebra easier.
It's also possible to do it over the DAG, rather than a linear history - although that makes the code a lot more complicated. Unfortunately there doesn't seem to be a linear time cumulative sum algorithm over dags, so it's super linear in some cases.
This is really cool! Is there an alternative way of thinking about it involving a hidden markov model, looking for a change in value of an unknown latent P(fail)? Or does your approach end up being similar to whatever the appropriate Bayesian approach to the HMM would be?
My team doesn’t always have cleanly bisectable branches being merged to main —- it’s not uncommon to see “fix syntax error” types of commits.
But, to merge we need to have all tests pass. (If tests flakily pass then we get new flakey tests, yay!)
I know git-bisect doesn’t support this: but could git-bayesect have an option to only consider merge commits? Being able to track a flake change back to an individual PR would be really useful.