bendbro, today at 8:50 AM

tl;dr The best ~AI's~ LLM's slop asymptote is 10 hours.

Restated, if you let the best LLM chomp on a task for 10 hours, the output becomes slop.

* These tasks are of the type that you spend 1% of your SWE career working on.

* Each task is primed with an essay length prompt.

* You must play needle in a haystack for bugs in 10 hours' worth of AI-generated slop.

My experience trying AI coding at work and my observations of AI evangelists make me believe AI coding is exclusively the purview of people who are willing to handhold an AI at half pace to achieve the same result, while working on software that amounts to greenfield/toy problems.

The danger of LLMs to thought work is enormously overstated and intentionally overhyped. AI : StackOverflow :: StackOverflow : graybeard in basement

It would be cool if AI killed all thought work, but what will actually happen is an undersupply of SWEs and a second golden age of SWE salaries in like 15 years.

https://github.com/METR/public-tasks/tree/main