I'm honestly having trouble understanding all the benefits and drawbacks of the different agents, specifically around which permissions I want to grant.
My solution has been to create a new VM from a base image that has the Claude CLI and Gemini CLI pre-installed.
That way I can configure all the permissions I want at the host level, and the agent is less likely to access files it shouldn't, or worse, delete them. I know this limits what I can do, but I have exhausted my capacity for understanding and auditing the different options for each agent.
I can install a new agent on that VM and try it, but it is hard to justify the effort of testing each one.
What am I getting from your tool, for example? Worktree support is somewhat common, right? Does this give me multi-agent support that Gemini and Claude do not, and does that mean collaboration across team members? Is your approach better, or safer, than what I'm doing? How do I verify those claims?
Can I use your tool with local models like Gemma 4 via ollama or llama.cpp? I have three 24 GB Nvidia cards and would like to try a three-agent approach: one to write the code, one to write tests, one to architect. I obviously can't use local models with the Gemini and Claude CLIs.
I'm just riffing on my concerns, and thanks for listening.
I think your concerns are valid and echo a lot of what I've heard from others experiencing the same uncertainties.
--- RE: Sandboxing and Permissions ---
First, make sure you know the Lethal Trifecta: https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/
If you run a coding agent with full yolo permissions on your machine, there are two major problems: (1) unrestricted internet access is a vector for prompt injection and code/data exfiltration, and (2) the agent can reach other things on your machine that you don't want it to access or modify.
Most coding agent harnesses went for the "low friction" sandboxing approach and used Seatbelt on Mac. This doesn't work well in practice: you can't allowlist specific safe domains (so it's either all internet or no internet), and it's tricky to allowlist specific locations on disk (agents ideally need to install system packages, work with mobile simulators, etc., and a lot of that lives on disk outside your workspace).
So our solution to this looks a lot like yours: give your agents a container and a network policy, then let them yolo. Under that policy, they can't reach anything on your disk or the internet beyond what you narrowly allow.
This is not only a cleaner sandbox model, it also lets you give the agents more autonomy instead of having them pause on every command for approval.
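To make the "narrowly allow" part concrete, here is a minimal sketch of the kind of egress allowlist check a sandbox proxy might enforce; the domain list and function are purely illustrative, not ctx's actual policy or code:

```python
from urllib.parse import urlparse

# Illustrative allowlist: package registries and source hosts the
# agent needs, everything else blocked. Not ctx's actual policy.
ALLOWED_DOMAINS = {"pypi.org", "files.pythonhosted.org", "github.com"}

def is_allowed(url: str) -> bool:
    """Allow a request only if its host matches an allowlisted domain
    or is a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)

print(is_allowed("https://pypi.org/simple/requests/"))  # True
print(is_allowed("https://evil.example.com/exfil"))     # False
```

The point is that the policy lives outside the agent: the agent can yolo freely, and exfiltration attempts simply fail at the network boundary.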
Your VM solution is definitely the right idea as well. The difference with ctx is that we automatically manage a lot of the VM complexity, including elastic memory.
--- RE: Worktrees, Multi-Agent, Collaboration ---
Yes, worktree support is common now. What you mention about multi-agent support and collaboration across team members is spot on. All of your agent transcripts are stored in a unified format locally, so your conversations with Claude Code look exactly like your conversations with Gemini. So if your teammate uses one and you use another, the idea is that they can see your work equivalently.
Another interesting aspect is that multi-agent support is harness-agnostic, so you can have a Claude Code primary agent invoke a Gemini subagent.
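Purely as an illustration of what "unified format" buys you (this is a made-up schema, not ctx's actual storage format), tagging each turn with the harness that produced it lets mixed sessions read uniformly:

```python
# Made-up unified transcript records; the schema is illustrative only.
transcript = [
    {"harness": "claude-code", "role": "user",      "text": "Add a retry helper."},
    {"harness": "claude-code", "role": "assistant", "text": "Done, see the diff."},
    {"harness": "gemini-cli",  "role": "assistant", "text": "Subagent: tests added."},
]

def harnesses_used(entries):
    """List every harness that contributed a turn to this conversation."""
    return sorted({e["harness"] for e in entries})

print(harnesses_used(transcript))  # ['claude-code', 'gemini-cli']
```

With one record shape, a teammate's viewer doesn't care which CLI produced which turn, which is what makes the cross-harness subagent idea workable.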
--- RE: Local Models ---
We don't set anything up specifically for this, but any agent harness that already works with local models will work the same inside ctx. I think Codex and OpenCode are both fairly easy to use with local models, whereas Gemini and Claude Code are harder to set up this way. If you try it, I'd be interested to hear how it goes for you.
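For the three-agent idea, here is a minimal sketch of how you might split roles across local OpenAI-compatible endpoints (e.g. one ollama or llama.cpp server per GPU); the model names and ports are placeholders, not a tested configuration:

```python
# Hypothetical role-to-endpoint mapping, one local server per GPU.
# Model names and ports are placeholders, not a recommended setup.
ROLES = {
    "architect": {"model": "local-model-a", "base_url": "http://localhost:8001/v1"},
    "coder":     {"model": "local-model-b", "base_url": "http://localhost:8002/v1"},
    "tester":    {"model": "local-model-c", "base_url": "http://localhost:8003/v1"},
}

def endpoint_for(role: str) -> tuple[str, str]:
    """Return the (base_url, model) pair a harness should be pointed at
    for the given role."""
    cfg = ROLES[role]
    return cfg["base_url"], cfg["model"]

print(endpoint_for("coder"))  # ('http://localhost:8002/v1', 'local-model-b')
```

Any harness that accepts a custom OpenAI-compatible base URL could then be launched once per role against the matching endpoint.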