Hacker News

pastescreenshot, yesterday at 8:05 PM

What has worked better for me is splitting authority, not just prompts. One agent can touch app code, one can only write failing tests plus a short bug hypothesis, and one only reviews the diff and test output. Also make test files read-only for the coding agent. That cuts out a surprising amount of self-grading behavior, since the coding agent can no longer edit the tests that judge its own change.
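One way this split could be expressed, as a rough sketch rather than a description of the commenter's actual setup: a per-role write policy that a tool wrapper consults before any file write. The role names, the `src/` and `tests/` layout, and the `can_write` helper are all assumptions made up for illustration.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical roles and repo layout (assumptions, not from the comment).
# Each role lists glob patterns it may write; deny patterns always win.
WRITE_RULES = {
    "coder": {"allow": ["src/*"], "deny": ["tests/*", "*/test_*.py"]},
    "test_writer": {"allow": ["tests/*"], "deny": []},
    "reviewer": {"allow": [], "deny": ["*"]},  # reads diff and test output only
}


def can_write(role: str, path: str) -> bool:
    """Return True if `role` is allowed to write the repo-relative `path`."""
    rules = WRITE_RULES.get(role)
    if rules is None:
        return False  # unknown roles get no write access

    p = PurePosixPath(path)
    # Reject absolute paths and ".." segments so that a path such as
    # "src/../tests/x.py" cannot slip past the allow pattern.
    if p.is_absolute() or ".." in p.parts:
        return False

    name = p.as_posix()
    if any(fnmatchcase(name, pattern) for pattern in rules["deny"]):
        return False
    return any(fnmatchcase(name, pattern) for pattern in rules["allow"])


if __name__ == "__main__":
    for role, path in [
        ("coder", "src/app/main.py"),
        ("coder", "tests/test_app.py"),
        ("test_writer", "tests/test_bug.py"),
        ("reviewer", "src/app/main.py"),
    ]:
        print(role, path, can_write(role, path))
```

A check like this only helps if every write the agent makes goes through it. For a harder boundary, the same rules would need to be backed by something the agent cannot talk its way around, such as file permissions or a read-only mount of the test directory.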


Replies

huslage, yesterday at 9:25 PM

How do you limit access like that?