Old, but "Jagged Technological Frontier" explains the challenge a bit: https://www.hbs.edu/faculty/Pages/item.aspx?num=64700. Namely, it's hard, and the lack of reproducibility (models quickly becoming inaccessible to researchers) makes this kind of study very challenging.
That is an old empirical study. jadenpeterson was talking about some fundamental limitations of LLMs.