logoalt Hacker News

borensteinyesterday at 4:01 PM1 replyview on HN

Hi, thanks for your feedback! I can see this from a couple of different perspectives.

On the one hand, you're right: those commit messages are proof positive that the security is not perfect. On the other hand, the threat model is that most threats from AI agents stem from human inattention, and that agents powered by hyperscaler models are unlikely to be overtly malicious without an outside attacker.

There are some known limitations of the security model, and they are limitations that I can accept. But I do believe that yolo-cage provides security in depth, and that the security it provides is greater than what is achieved through permission prompts that pop up during agent turns in Claude Code.


Replies