Yeah, I was thinking about simonw's lethal trifecta[0] and how to solve it, and my conclusion was "you cannot", i.e. you just accept a certain level of risk for the rewards it offers.
The "agent never sees keys" approach prevents key exfiltration, but it doesn't prevent agent from nuking what it has access to, nor prevent data exfiltration.
The best advice I heard for protecting against prompt injection was "just use Opus" (... which was great advice before they lobotomized it ;)
But even without injection, most of the horror stories come from random errors, or from the AI trying to be helpful (e.g. stealing your keys or working around security restrictions, because they're trained to really want to complete the task [1]).
tl;dr yolo
[0] https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
[1] https://www.reddit.com/r/ClaudeAI/comments/1r186gl/my_agent_...