Top solve rate is currently 24% with Opus 4.8... What's a competent human supposed to score?

jonathanleane • today at 3:24 AM

lacunary • today at 3:33 AM

presumably whatever the top model scores and then some, since the human can use the model.

I wonder if a model could score higher if it had a human at its disposal?
