Hacker News

probably_wrong 10/11/2024

Your experience makes me think the models got a better success rate not because they are better at reasoning, but because the problem made it into their training dataset.


Replies

andrepd 10/11/2024

Absolutely! It's the elephant in the room with these ducking "we've solved 80% of maths olympiad problems" claims!

s-macke 10/11/2024

We don't know. The paper and the problem were very prominent at that time. Some developers at Anthropic or OpenAI might have included it in some way, either as a test or as a task to improve the CoT via reinforcement learning.
