How's the reproducibility of the results? For example, how does the average score over 10 runs compare to the original?
Author here: The code is up on GitHub.
The probes I used seem to help identify good configurations, but are quite noisy. I initially used a small probe set to keep the scan tractable, then retested the higher-ranked models on a set ~10x larger.
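For anyone who wants to picture the two-stage procedure, here is a rough sketch of the idea, not the code from the repo. All names (`evaluate`, `two_stage_scan`, `noisy_score`) and the toy scoring function are made up for illustration; the only things taken from the description above are the cheap first pass on a small probe set and the retest of the top-ranked candidates on a set about 10x larger.

```python
import random
from statistics import mean


def evaluate(config, probes, score_fn):
    """Mean score of one configuration over a probe set."""
    return mean(score_fn(config, probe) for probe in probes)


def two_stage_scan(configs, small_probes, large_probes, score_fn, top_k):
    """Rank all configs on a small probe set, then retest the best on a larger one."""
    # Stage 1: cheap but noisy ranking of every configuration.
    coarse = {c: evaluate(c, small_probes, score_fn) for c in configs}
    shortlist = sorted(configs, key=coarse.get, reverse=True)[:top_k]

    # Stage 2: retest only the shortlist on the larger probe set.
    refined = {c: evaluate(c, large_probes, score_fn) for c in shortlist}
    return sorted(refined.items(), key=lambda item: item[1], reverse=True)


def noisy_score(config, probe):
    """Hypothetical stand-in for a real probe: true quality plus Gaussian noise."""
    rng = random.Random(config * 100003 + probe)  # deterministic per (config, probe)
    return config / 10 + rng.gauss(0.0, 1.0)


configs = list(range(10))          # toy configurations, higher is truly better
small_probes = list(range(20))     # small set to keep the scan tractable
large_probes = list(range(200))    # ~10x larger set for the retest

for config, score in two_stage_scan(configs, small_probes, large_probes, noisy_score, top_k=3):
    print(config, round(score, 3))
```

The averaging in stage 2 is what tames the noise: with roughly 10x more probes, the standard error of each mean score drops by about a factor of 3, so the final ordering of the shortlist is more trustworthy than the initial scan.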