logoalt Hacker News

ajbyesterday at 9:30 PM1 replyview on HN

In theory, the algorithm could deal with that by choosing the commit at each step, which gives the best expected information gain; divided by expected test time. In most cases it would be more efficient just to cache the compiled output though.


Replies

sfinkyesterday at 10:11 PM

This doesn't sound quite right, but I'm not sure why.

Perhaps: a reasonable objective would be to say that for N bits of information, I would like to pick the test schedule that requires the least total elapsed time. If you have two candidate commits and a slow recompile time, it seems like your algorithm would do many repeats of commit A until the gain in information per run drops below the expected gain from B divided by the recompile time, then it would do many repeats of B, then go back to A, etc. So there are long runs, but you're still switching back and forth. You would get the same number of bits by doing the same number of test runs for each commit, but batching all of the A runs before all of the B runs.

Then again: you wouldn't know how many times to run each in advance, and "run A an infinite number of times, then run B an infinite number of times" is clearly not a winning strategy. Even with a fixed N, I don't think you could figure it out without knowing the results of the runs in advance. So perhaps your algorithm is optimal?

It still feels off. You're normalizing everything to bits/sec and choosing the maximum. But comparing an initial test run divided by the rebuild time vs a subsequent test run divided by a much faster time seems like you're pretending a discrete thing is continuous.

I wish I could math good.

show 2 replies