The way benchmarks are currently run and accepted by the community makes for really uninspired work. Until we're willing to break out of this rigid evaluation format, which is prone to crazy overfitting and gaming, talent will move elsewhere. It's kind of a chicken-and-egg problem, though.