This! The safeguards need to be outside the LLM, and they need to be deterministic.
Now I wish I could reject `git reset --hard` on my local system somehow.
Sounds like you care about data stored on your filesystem! Take one step back and solve that problem. Use a proper isolated sandbox, e.g. a GitHub workspace on an account that is working with a fork.
Care about the data in that workspace? Push it first.
Otherwise it is a cat-and-mouse game of whack-a-mole.
Just fork git and patch that out? Can't be that hard; just ask the agent for that patch. You don't need to update often either, so it's OK to rebase like twice a year.
You could use a wrapper that parses all the command-line options. Basically you loop over "$@", look for strings starting with '-' and '--', and skip those (taking care with global options like `-C <path>` and `-c <name>=<value>`, which consume the next argument too); then look for a non-option argument and store that as the subcommand; then look for more '-' and '--' options. Once that's all done you have enough to find subcommand "reset", subcommand option "--hard". About 50 lines of shell script.
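For what it's worth, here is a rough sketch of that wrapper idea in bash. The names `is_hard_reset` and `guarded_git` are made up for illustration, and the list of global options that take a separate value is my assumption of the common ones, not an exhaustive list. Treat it as a starting point rather than a hardened guard.

```shell
#!/usr/bin/env bash
# Sketch of a deterministic guard against `git reset --hard`.
# is_hard_reset and guarded_git are hypothetical names, not part of git.

# Return 0 if the given git arguments amount to `reset ... --hard`, else 1.
is_hard_reset() {
    local subcommand="" arg
    # Phase 1: skip global options until the first non-option argument.
    while [ "$#" -gt 0 ]; do
        case "$1" in
            # Global options that consume the following argument
            # (assumed list; extend it for your git version).
            -C|-c|--git-dir|--work-tree|--namespace)
                shift
                if [ "$#" -gt 0 ]; then shift; fi
                ;;
            -*)
                shift
                ;;
            *)
                subcommand="$1"
                shift
                break
                ;;
        esac
    done
    [ "$subcommand" = "reset" ] || return 1
    # Phase 2: scan the subcommand's own arguments for --hard.
    for arg in "$@"; do
        case "$arg" in
            --) break ;;          # everything after -- is a path, not an option
            --hard) return 0 ;;
        esac
    done
    return 1
}

# Run the real git unless the arguments are a hard reset.
guarded_git() {
    if is_hard_reset "$@"; then
        echo "refusing to run: git reset --hard" >&2
        return 1
    fi
    command git "$@"
}
```

To make it apply to an agent rather than just your interactive shell, you would probably put the same logic in a script named `git` that sits earlier on PATH than the real binary. It still won't catch everything: git aliases that expand to a hard reset, abbreviated long options, and anything that calls the real binary by absolute path all slip past, which is rather the whack-a-mole point made above.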