> If you look at the security measures in other coding agents, they're mostly security theater. As soon as your agent can write code and run code, it's pretty much game over.
At least for Codex, the agent runs commands inside an OS-provided sandbox (Seatbelt on macOS, Landlock and seccomp on Linux). It does not end up "making the agent mostly useless".
My Codex just uses Python to write files around the sandbox when I ask it to patch an SDK outside its allowed path.
Does Codex randomly decide to disable the sandbox like Claude Code does?
You really shouldn’t be running agents outside of a container. That’s 101.