Esophagus4, yesterday at 10:32 PM

I think for me, it’s not so much an objective success metric as it is showing its progression over time.

What amazes me is how fast LLMs are progressing. And it still feels like early days (!).

For methodology, though, I would check out the METR website; they’ve published their results.