The output the agent creates falls into one of these categories:
1. Correct, maintainable changes
2. Correct, unmaintainable changes
3. Correct diff, maintains expected system interaction
4. Correct diff, breaks system interaction
They are in no way consistent or deterministic, yet they _always_ look convincingly correct.