The only table where they showed comparisons against Opus 4.5 and Gemini 3:
https://x.com/OpenAI/status/1999182104362668275
https://i.imgur.com/e0iB8KC.png
100% on the AIME (assuming it's not in the training data) is pretty impressive. I got like 4/15 when I was in HS...