vidarh • today at 11:51 AM • 1 reply

You need a harness, yes, and you need quality gates the agent can't tamper with, gates that just kick the work back with a stern message to fix the problems. Otherwise you're wasting your time reviewing incomplete work.
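
A rough sketch of what such a gate loop might look like, assuming the agent is just a callable that takes a task plus optional feedback and returns its work as a string. The names here (`GateResult`, `run_with_gates`, `no_todo_gate`) are made up for illustration and don't come from any particular framework. The gates live in the harness, outside anything the agent can edit:

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class GateResult:
    name: str
    passed: bool
    message: str = ""


# Hypothetical types: a gate inspects the work, an agent produces it.
Gate = Callable[[str], GateResult]
Agent = Callable[[str, Optional[str]], str]


def run_with_gates(agent: Agent, gates: List[Gate], task: str,
                   max_rounds: int = 3) -> Optional[str]:
    """Return work that passed every gate, or None if the agent never got there."""
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        work = agent(task, feedback)
        failures = [r for r in (gate(work) for gate in gates) if not r.passed]
        if not failures:
            return work  # only now is it worth a human review
        # Kick the work back with the list of problems to fix.
        feedback = "Fix these problems before resubmitting:\n" + "\n".join(
            f"- {r.name}: {r.message}" for r in failures
        )
    return None


def no_todo_gate(work: str) -> GateResult:
    """Toy gate: reject work that still contains TODO markers."""
    return GateResult("no_todo", "TODO" not in work, "remove TODO markers")
```

In practice the gates would be things like the test suite, a linter, or a type checker run by the harness; the point of the sketch is only that the pass/fail decision is made by code the agent doesn't control.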

Replies

irthomasthomas • today at 12:43 PM

Here is an example where the prompt was only a few hundred tokens and the output reasoning chain was correct, but the actual function call was wrong: https://x.com/xundecidability/status/2005647216741105962?s=2...
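
One way a harness could catch that general kind of mismatch, whatever the reasoning text says, is to check the emitted call against the tool's declared parameters before running it. This is a minimal sketch under my own assumptions: the `TOOLS` table, the `read_file` tool, and `validate_call` are hypothetical, and it is not a reconstruction of the linked example.

```python
from typing import Any, Dict, List

# Hypothetical tool registry: tool name -> {argument name: expected type}.
TOOLS: Dict[str, Dict[str, type]] = {
    "read_file": {"path": str},
}


def validate_call(name: str, args: Dict[str, Any],
                  tools: Dict[str, Dict[str, type]]) -> List[str]:
    """Return a list of problems with the call; an empty list means it looks well-formed."""
    if name not in tools:
        return [f"unknown tool: {name}"]
    spec = tools[name]
    errors: List[str] = []
    for key, expected in spec.items():
        if key not in args:
            errors.append(f"missing argument: {key}")
        elif not isinstance(args[key], expected):
            errors.append(f"{key} should be {expected.__name__}")
    for key in args:
        if key not in spec:
            errors.append(f"unexpected argument: {key}")
    return errors
```

A check like this only catches structurally wrong calls (wrong tool, wrong or missing arguments). A call that is well-formed but aimed at the wrong thing still needs a gate that looks at the result.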
