Hacker News

wmf, today at 5:28 AM

They measure the time it takes a human to complete the task. They don't care how long the AI takes (although in practice it's much faster than a human). Measuring tokens isn't a good idea because newer models can complete the same tasks using fewer tokens.