Hacker News

lostmsu today at 5:11 PM

Are they ahead of all other recent open models? Is there a leaderboard?


Replies

NitpickLawyer today at 5:23 PM

There is a leaderboard [1], but we'll have to wait until April for the competition to end to know which models they're using. The current number 3 on there (34/50) has mentioned in discussions that they're using gpt-oss-120b. There were also some scores shared for gpt-oss-20b, in the 25/50 range.

The next "public" model is qwen30b-thinking at 23/50.

The competition is limited to 1 H100 (80GB) and 5h of runtime for 50 problems, so larger open models (DeepSeek, the larger Qwens) don't fit.
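
To put rough numbers on those limits, here is a back-of-the-envelope sketch. It only covers the average time budget per problem and the memory needed for the weights. The parameter counts (taken at face value from the model names) and the bits-per-parameter values are illustrative assumptions, not figures from the competition, and the estimate ignores KV cache, activations, and runtime overhead, so real headroom is smaller.

```python
GPU_MEMORY_GB = 80   # single H100 (80GB), as stated above
TOTAL_HOURS = 5      # runtime limit, as stated above
NUM_PROBLEMS = 50    # problem count, as stated above


def minutes_per_problem(total_hours: float, num_problems: int) -> float:
    """Average wall-clock budget per problem, in minutes."""
    return total_hours * 60 / num_problems


def weight_footprint_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes).

    params_billion * 1e9 params * (bits / 8) bytes, divided by 1e9 bytes per GB.
    """
    return params_billion * bits_per_param / 8


def weights_fit(params_billion: float, bits_per_param: float,
                gpu_gb: float = GPU_MEMORY_GB) -> bool:
    """True if the weights alone are smaller than GPU memory.

    Optimistic: leaves no room for KV cache or activations.
    """
    return weight_footprint_gb(params_billion, bits_per_param) < gpu_gb


if __name__ == "__main__":
    print(f"budget: {minutes_per_problem(TOTAL_HOURS, NUM_PROBLEMS)} min/problem")
    # Hypothetical (label, params in billions, bits per parameter) combinations.
    for label, params_b, bits in [
        ("~120B at 4-bit", 120, 4),
        ("~120B at 16-bit", 120, 16),
        ("~30B at 16-bit", 30, 16),
        ("~670B at 4-bit", 670, 4),
    ]:
        size = weight_footprint_gb(params_b, bits)
        print(f"{label}: {size:.1f} GB, weights fit: {weights_fit(params_b, bits)}")
```

Under these assumptions a ~120B model only squeezes in when quantized to about 4 bits per parameter, and anything in the several-hundred-billion range is out even then.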

[1] https://www.kaggle.com/competitions/ai-mathematical-olympiad...
