Not sure I understand. Wouldn't permissions prevent this? The user runs with `--dangerously-skip-permissions`, so they can expect wild behaviour. They should run with permissions and a ruleset.
Who knows whether permissions would prevent this? Anthropic's documentation on permissions (https://code.claude.com/docs/en/permissions) does not describe how permissions are enforced; a slightly uncharitable reading of "How permissions interact with sandboxing" suggests that they are not really enforced and any prompt injection can circumvent them.
The rules and permissions are no longer program flags, but plain text for the agent to "obey".
You could prevent this even with `--dangerously-skip-permissions` by using a simple `PreToolUse` hook.
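
For what it's worth, here is a rough sketch of what such a hook could look like. It assumes, as I understand the Claude Code hooks documentation, that the hook command receives a JSON event on stdin with `tool_name` and `tool_input` fields, and that exiting with status 2 blocks the tool call and passes stderr back to the agent. Check the current docs before relying on either. The `DENY_PATTERNS` list and the helper names `decide` and `run` are made up for illustration; substitute whatever "this" is in your case.

```python
#!/usr/bin/env python3
"""Sketch of a PreToolUse hook that vetoes risky shell commands."""
import json
import re
import sys

# Hypothetical deny-list, for illustration only.
DENY_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*[rR][a-zA-Z]*\b"),  # recursive rm (-r, -rf, -fr)
    re.compile(r"\bgit\s+push\b.*--force"),           # force pushes
]

ALLOW, BLOCK = 0, 2  # assumption: exit status 2 means "block this tool call"


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, reason) for one hook event."""
    # Assumed field names: "tool_name" and "tool_input" -> "command".
    if event.get("tool_name") != "Bash":
        return ALLOW, ""
    command = str((event.get("tool_input") or {}).get("command", ""))
    for pattern in DENY_PATTERNS:
        if pattern.search(command):
            return BLOCK, f"blocked by hook: command matches {pattern.pattern!r}"
    return ALLOW, ""


def run(payload: str, err) -> int:
    """Parse the raw stdin payload and return the exit code to use."""
    if not payload.strip():
        return ALLOW  # nothing to inspect
    try:
        event = json.loads(payload)
    except json.JSONDecodeError:
        # Fail closed on input we cannot parse.
        print("blocked by hook: unparsable event payload", file=err)
        return BLOCK
    if not isinstance(event, dict):
        print("blocked by hook: unexpected event shape", file=err)
        return BLOCK
    code, reason = decide(event)
    if reason:
        print(reason, file=err)
    return code


if __name__ == "__main__":
    has_input = sys.stdin is not None and not sys.stdin.isatty()
    exit_code = run(sys.stdin.read() if has_input else "", sys.stderr)
    if exit_code != ALLOW:
        sys.exit(exit_code)
```

You would register the script as a `PreToolUse` hook in your settings. The point is that the check runs as ordinary code outside the model, so a prompt injection cannot talk its way past it. That said, regex matching on a command string is easy to evade (shell indirection, scripts written to disk and then run), so treat this as a guard rail and not a sandbox.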