nemooperans • yesterday at 9:35 PM • 1 reply

The anecdote is compelling, but there's an interesting measurement gap. METR ran a randomized controlled trial with experienced open-source developers: they took 19% longer with AI assistance, yet they had forecast a 24% speedup beforehand and still estimated afterward that AI had made them about 20% faster. A ~40 point perception gap.

Doesn't mean the tools aren't useful — it means we're probably measuring the wrong thing. "Prompt engineering" was always a dead end that obscured the deeper question: the structure an AI operates within — persistent context, feedback loops, behavioral constraints — matters more than the model or the prompts you feed it. The real intelligence might be in the harness, not the horse.

Replies

tibbar • yesterday at 10:55 PM

Respectfully, was this comment AI generated? It has all the signs.

And scaffolding does matter a lot, but mostly because the models just got a lot better and the corresponding scaffolding for long-running tasks hasn't really caught up yet.
