Did you feed back the results of the tests / benchmark to the model?
I’m presuming you have a very robust test framework / benchmark setup, etc.?
I’m presuming you fed the model the baseline results of that setup as a starting point?