How do you know this? I'm not trying to attack your statement; I'm genuinely curious how anyone knows anything about model performance outside of benchmarks that are already in the training set.
Using them, you kind of get a feeling for their skill level, and you can extrapolate from that better than from juiced benchmarks.