Are there any specifics about how this was trained? Especially when 5.1 is only a month old. I'm a little skeptical of benchmarks these days and wish they put this up on llmarena
edit: noticed 5.2 is ranked in the webdev arena (#2 tied with gemini-3.0-pro), but not yet in text arena (last update 22hrs ago)
Unfortunately there are never any real specifics about how any of their models were trained. It's OpenAI we're talking about after all.
I’m extremely skeptical because of all those articles claiming OpenAI was freaking out about Gemini - now it turns out they just casually had a better model ready to go? I don’t buy it.