yorwba • today at 4:12 PM • 3 replies

This is not a benchmark. They just want to give people the opportunity to try their hand at solving novel questions with AI and see what happens. If an AI company pulls a solution out of their hat that cannot be replicated with the products they make available to ordinary people, that's hardly worth bragging about and in any case it's not the point of the exercise.

Replies

fph • today at 5:03 PM

The authors mention that before publication they tested these questions on Gemini and GPT, so the questions have already been available to the two biggest players; they have a head start.

cocoto • today at 4:54 PM

They could solve the problems and train their next models on the answers, so future models could “solve” these.

YeGoblynQueenne • today at 6:30 PM

Hey, sorry, totally out of context but I've always wanted to ask about the username. I keep reading it as "yoruba" in my mind. What does it mean, if I'm not being indiscreet?

alt Hacker News
