Really cool approach to the containment problem. The insight about "capping the blast radius of a confused agent" resonates - decision fatigue is real when you're constantly approving agent actions.
The exfiltration controls are interesting. Have you thought about extending this to rate limiting and cost controls as well? We've been working on similar problems at keypost.ai - deterministic policy enforcement for MCP tool calls (rate limits, access control, cost caps).
One thing we've found is that the enforcement layer needs to be in-path rather than advisory - agents can be creative about working around soft limits. Curious how you're handling the boundary between "blocked" and "allowed but logged"?
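To make the in-path versus advisory distinction concrete, here is a minimal sketch. It assumes a made-up `Enforcer` that every tool call must pass through. All the names, limits, and tool identifiers (`Policy`, `Verdict`, `shell.rm`, `http.post`) are hypothetical and are not taken from yolo-cage, keypost.ai, or the MCP spec. "Blocked" raises before the tool runs; "allowed but logged" runs the tool and records an audit entry.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    ALLOW_LOGGED = "allow_logged"
    BLOCK = "block"


@dataclass
class Policy:
    # Hypothetical limits and tool names, chosen only for illustration.
    max_calls_per_window: int = 3
    window_seconds: float = 60.0
    cost_cap: float = 1.00
    blocked_tools: frozenset = frozenset({"shell.rm"})
    logged_tools: frozenset = frozenset({"http.post"})


@dataclass
class Enforcer:
    policy: Policy
    spent: float = 0.0
    recent: deque = field(default_factory=deque)
    audit_log: list = field(default_factory=list)

    def check(self, tool: str, cost: float, now: float) -> Verdict:
        # Drop timestamps that have aged out of the rate-limit window.
        while self.recent and now - self.recent[0] >= self.policy.window_seconds:
            self.recent.popleft()
        if tool in self.policy.blocked_tools:
            return Verdict.BLOCK
        if len(self.recent) >= self.policy.max_calls_per_window:
            return Verdict.BLOCK
        if self.spent + cost > self.policy.cost_cap:
            return Verdict.BLOCK
        # Only calls that pass count toward the rate limit and the cost cap.
        self.recent.append(now)
        self.spent += cost
        if tool in self.policy.logged_tools:
            self.audit_log.append((now, tool, cost))
            return Verdict.ALLOW_LOGGED
        return Verdict.ALLOW

    def call(self, tool, fn, *args, cost=0.0, now):
        # In-path: the tool function is only reached if the check passes.
        if self.check(tool, cost, now) is Verdict.BLOCK:
            raise PermissionError(f"{tool} blocked by policy")
        return fn(*args)
```

The point of the design is that the agent never holds a direct reference to `fn`; an advisory version would return the verdict and trust the caller to honor it. Passing `now` in explicitly (rather than reading the clock) keeps the decision deterministic and easy to test.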
Great work shipping this - the agent security space needs more practical tools.
“Decision fatigue”: I don’t want to decide how to respond to this.
Thank you! Rate limits are an interesting topic with Claude Code right now. The Max subscription has them and the API does not, but the Max subscription is an all-you-can-eat buffet and the API is not. yolo-cage was built to be compatible with the Max subscription while remaining TOS-compliant, because it wraps the official CLI with no modifications. So far, that has meant the experience is _too_ rate limited for me.
That said, I have not yet started playing with MCP servers. I suspect that they are completely broken inside yolo-cage right now, as they almost certainly get stopped by the proxy.