Insert before 4: make it generate tests that fail, review, then have it implement and make sure the tests pass.
Insert before that: have it creates tasks with beads and force it to let you review before marking a task complete