> "Localize my app and add the option to change units"
To me this still feels like the wrong way to interact with a coding agent. Does this lead people to success? I've never seen it not go off the rails in some way unless you provide clear boundaries for the scope of the expected change. It's gonna write code before you even want it to, it's gonna write the test first or the logic first (whichever you didn't want), it'll be much too verbose or much too hacky, etc.
I've had no issues with prompts like that. I use Cursor with their plan mode, so I get a nice markdown file to iterate on or edit myself before it actually does anything.
And then
> gh-address-comments address comments
Inspiring stuff. I would love to be the one writing GH comments here. /s
But maybe there's a complementary gh-leave-comments to have it review PRs for you too.
The better models can handle that prompt assuming there is an existing clean codebase and the scope of the task is not too large. The existing code can act as an implicit boundary.
Weaker models give you the experience you describe, and when working in a 100% LLM-written codebase I think it can end up in a hall of mirrors.
Now I have an idea to try: a second LLM processing pass that normalizes the vibe-code to some personal style and standard, to break it out of the Stack Overflow snippet maze it can get itself into.
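Roughly what I have in mind, as a sketch only (assuming the OpenAI Python client; the model name and style_guide.md path are placeholders I made up, not anything from the article):

```python
# Second-pass normalizer sketch: feed agent-written code plus a personal style
# guide to another model and ask it to rewrite without changing behavior.
import sys
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
STYLE_GUIDE = Path("style_guide.md").read_text()  # hypothetical personal style/standards doc


def normalize_vibe_code(source: str, filename: str) -> str:
    """Rewrite a file to match the style guide, preserving behavior and public APIs."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model choice
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a code normalizer. Rewrite the given file to follow the "
                    "style guide. Do not change behavior, public APIs, or tests. "
                    "Output only the rewritten file contents.\n\n"
                    f"Style guide:\n{STYLE_GUIDE}"
                ),
            },
            {"role": "user", "content": f"# {filename}\n{source}"},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    path = Path(sys.argv[1])
    path.write_text(normalize_vibe_code(path.read_text(), path.name))
```

You'd still want to diff and run the tests afterwards, of course; the point is just to get everything pulled toward one house style instead of whatever snippet the model happened to pattern-match on.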