Can anyone help clarify a couple of doubts? I didn't see any information about how different the test/benchmark set is from the training set. That feels like an important gap to leave unfilled in an ML paper. What if there is overlap between the problems in the test set and the training set? And what is the decontamination strategy for going from LCBv5 to LCBv6?