Shouldn’t there be a lot of skepticism here?
All the problems they claim to have solved are on the Internet, and they explicitly say they crawled them. They do not mention doing any benchmark decontamination or excluding 2024/2025 competition problems from training.
IIRC, OpenAI/Google did not have access to the 2025 problems before testing their experimental math models.