Hacker News

Tiberium, yesterday at 6:35 PM

The only table where they showed comparisons against Opus 4.5 and Gemini 3:

https://x.com/OpenAI/status/1999182104362668275

https://i.imgur.com/e0iB8KC.png


Replies

varenc, yesterday at 6:45 PM

100% on the AIME (assuming it's not in the training data) is pretty impressive. I got like 4/15 when I was in HS...
