The set of math/logic problems behind AIME 2024 appears to be... https://artofproblemsolving.com/wiki/index.php/2024_AIME_I_P...
Impressive stuff! But unclear to me if it's literally just these 15 or if there's a large problem set...
Doesn't seem too hard to me. Shame I was never exposed to this stuff in high school.
Edit: oh I see, they get progressively harder.
The full dataset is here: https://huggingface.co/datasets/AI-MO/aimo-validation-aime. You can use the eval script I have in optillm to benchmark on it: https://github.com/codelion/optillm/blob/main/scripts/eval_a...
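For anyone curious what the scoring step of such a benchmark might look like, here is a rough sketch. It is not the optillm eval script, just an illustration. It relies on the fact that AIME answers are integers from 0 to 999, and it assumes the model's final answer is the last such number in its response. The names `extract_answer` and `accuracy` are made up for this example, and loading the dataset and calling a model are left out.

```python
import re


def extract_answer(text):
    """Return the last standalone 1-3 digit integer in the text, or None.

    Assumption: the model states its final answer last. Longer digit runs
    (e.g. the year 2024) are skipped because AIME answers are 0..999.
    """
    matches = re.findall(r"(?<!\d)\d{1,3}(?!\d)", text)
    return int(matches[-1]) if matches else None


def accuracy(responses, answers):
    """Fraction of responses whose extracted answer equals the reference."""
    if not answers:
        return 0.0
    correct = sum(
        extract_answer(resp) == int(ans)
        for resp, ans in zip(responses, answers)
    )
    return correct / len(answers)
```

Calling `accuracy(model_outputs, reference_answers)` with two equal-length lists gives a score between 0 and 1. A real harness would likely need stricter answer parsing (for example a required "final answer" marker) than this last-number heuristic.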