"The model is getting worse" has been rumored so often, by now, shouldn't there be some trusted group(s) continually testing the models so we have evidence beyond anecdote?