nightshift1 • yesterday at 7:43 AM

>which human

The second graph has this under it:

The length of tasks (measured by how long they take human professionals) that generalist frontier model agents can complete autonomously with 50% reliability has been doubling approximately every 7 months for the last 6 years...

Replies

twotwotwo • yesterday at 8:08 AM

Yeah, I wanted a short way to gesture at the subsequent "tasks that are fast for someone but not for you are interesting," and did not mean it as a gotcha on METR. I should've taken a second longer and pasted what they said rather than doing the "presumably a human competent at the task" handwave that I did.
