You can do this now: change the file permissions such that the user you run codex as can't read them, or run codex in a container without those files mounted.
If you don't do that, the agent will be able to incidentally upload them. What if the model runs "rg foo", and one of those files contains the string "foo"? It uploads the tool output, which includes the file contents.
And so, the only solution is to make it so the codex process is unable to access those files, hence using a container, or unix permissions, or deleting the files. Which you can already do.
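For the Unix permissions route, a minimal sketch of what "make the files unreadable" could look like, assuming a POSIX system and that the agent runs as a different user from the one who owns the files. The helper names (`restrict_to_owner`, `readable_by_others`) are made up for illustration. This is the Python equivalent of `chmod go-rwx`, nothing Codex-specific.

```python
import os
import stat
import tempfile


def restrict_to_owner(path: str) -> int:
    """Drop all group/other permission bits, keeping only the owner's bits."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    restricted = current & stat.S_IRWXU
    os.chmod(path, restricted)
    return restricted


def readable_by_others(path: str) -> bool:
    """True if the group or other read bit is set on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRGRP | stat.S_IROTH))


if __name__ == "__main__":
    # Hypothetical secrets file, created in a temp location for the demo.
    with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
        f.write("API_KEY=example\n")
    os.chmod(f.name, 0o644)
    print(readable_by_others(f.name))
    restrict_to_owner(f.name)
    print(readable_by_others(f.name))
    os.unlink(f.name)
```

Note that this does nothing if the agent runs as the same user that owns the file, which is why the separate user (or container) is the part that actually matters.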
I imagine this isn't resolved primarily because people expect it to apply to bash tool use, not just the "read" and "edit" tools, and people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly.
Just be aware that AI agents will explore alternate means of accessing said files: https://news.ycombinator.com/item?id=48348578
> people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly
You could always use setuid to allow the agent to run designated commands whose operation depends on the files, without the agent itself being able to access the files.
While this is true, there is also a layer in the harness between the output of _any_ tool (e.g. stdout or hand-rolled tools) and the LLM. A tool could read the file, but the agentic harness could then redact the output before returning it to the LLM if any of it matched the file contents. We do something similar in Plotly Studio, where we check the entropy of strings in the user input and flag and redact any high-entropy strings to the user as “potential credentials” that the user might have inadvertently copied and pasted into the prompt before sending it to the LLM.
There are ways around this - the LLM can always be clever by invoking tools to read the file contents in a form other than the direct file contents - but this is all to say that the agentic harness layer _does_ allow for deterministic logic in between tool output and the LLM requests.
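To make the "deterministic logic between tool output and the LLM" idea concrete, here is a rough sketch of both checks: exact matching against known secret file contents, and an entropy heuristic for credential-looking strings. Everything here is an assumption for illustration (the function names, the 20-character minimum, the 4.0 bits/char threshold, the 8-character minimum for secret lines). It is not how Plotly Studio or Codex actually implement it.

```python
import math
import re
from collections import Counter

# Candidate tokens: long runs of characters typical of keys and base64 blobs.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/_\-]{20,}")


def shannon_entropy(s: str) -> float:
    """Bits per character of the empirical character distribution of s."""
    if not s:
        return 0.0
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def redact_high_entropy(text: str, threshold: float = 4.0) -> str:
    """Replace long tokens whose entropy meets the (assumed) threshold."""
    def repl(match: re.Match) -> str:
        token = match.group()
        return "[REDACTED]" if shannon_entropy(token) >= threshold else token
    return TOKEN_RE.sub(repl, text)


def redact_known_secrets(text: str, secret_lines: list) -> str:
    """Replace verbatim occurrences of lines taken from protected files."""
    for line in secret_lines:
        line = line.strip()
        if len(line) >= 8:  # skip short lines that would match everywhere
            text = text.replace(line, "[REDACTED]")
    return text


if __name__ == "__main__":
    tool_output = "found: API_KEY=hunter2hunter2 in .env"
    print(redact_known_secrets(tool_output, ["API_KEY=hunter2hunter2\n"]))
```

Exact matching only catches verbatim leaks. If the model gets a tool to emit the contents in a transformed form (base64, reversed, split across calls), this filter misses it, which is the caveat already raised above.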
Sandboxing is a solved problem: there are dozens of providers of Firecracker instances to run your agent in.
The problem to be solved is how do you define task-specific least privilege versions of your coding agent.
If you're not sandboxing your agent, everything on your computer is waiting to be exposed.
Assuming that file permissions will save you is naively dangerous.
> I imagine this isn't resolved primarily because people expect it to apply to bash tool use, not just the "read" and "edit" tools, and people also expect those files to still be accessible i.e. if the agent invokes "make", which makes it impossible to solve perfectly.
Also, why would they add a feature to prevent data collection if the data makes the company even more valuable, and you might even get good deals from the current government if you provide access to this data?
Yes, this was solved decades ago. How do you stop a human from reading one of your files?
chmod 600
> You can do this now: change the file permissions such that the user you run codex as can't read them, or run codex in a container without those files mounted.
That's quite inconvenient. I want to run my coding agent in a restricted version of my regular user context, not something that drives like a separate machine.
> What if the model runs "rg foo", and one of those files contains the string "foo"? It uploads the tool output, which includes the file contents.
You have codex run rg in the sandbox, and the sandbox can't read foo. Why is this model so difficult to understand? Codex already runs a variety of commands under a bwrap/seatbelt/etc. sandbox. I've merely extended Codex to run everything in a sandbox. Escalation isn't a matter of whether to run a command in a sandbox or not: it's a matter of which sandbox policy to apply to whatever it is the model asked to do.
> the only solution is to make it so the codex process is unable to access those files
That's not true. Restrictions need apply only to the tools the model runs, not the Codex process itself. You can always insert a process-and-sandbox boundary between the harness and its tools. Codex inserts this boundary most of the time anyway. I've extended my Codex to do it all the time, even for things like the read-a-file tool.
Works fine.
> I imagine this isn't resolved primarily because people expect it to apply to bash tool use,
Yeah? Applying it to the shell tool [1] is trivial. It's actually harder to apply the sandbox to non-shell tools. It just isn't hard conceptually: you define a sandbox policy, writing down what's allowed and not, and just filter everything the model does through this policy via OS-level lightweight sandboxing tools.
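As a sketch of "write down what's allowed and filter everything through it", here is one way a policy could be turned into a bubblewrap command line. This is an illustration under assumptions, not Codex's actual sandbox code: `SandboxPolicy` and `bwrap_argv` are hypothetical names, and the code only builds the argv without executing anything. It masks individual files by bind-mounting `/dev/null` over them; directories would need a different treatment (e.g. `--tmpfs`).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SandboxPolicy:
    """Hypothetical policy: what a model-issued command may see and touch."""
    workdir: str
    writable: bool = False
    hidden_paths: list[str] = field(default_factory=list)  # regular files to mask
    allow_network: bool = False


def bwrap_argv(policy: SandboxPolicy, command: list[str]) -> list[str]:
    """Build a bwrap invocation that runs `command` under `policy`."""
    argv = [
        "bwrap", "--die-with-parent",
        "--ro-bind", "/", "/",   # whole filesystem visible, read-only
        "--dev", "/dev",
        "--proc", "/proc",
        "--tmpfs", "/tmp",
    ]
    # The working directory is writable only if the policy says so.
    bind = "--bind" if policy.writable else "--ro-bind"
    argv += [bind, policy.workdir, policy.workdir]
    # Mask each protected file so reads inside the sandbox see an empty file.
    for path in policy.hidden_paths:
        argv += ["--ro-bind", "/dev/null", path]
    if not policy.allow_network:
        argv.append("--unshare-net")
    argv += ["--chdir", policy.workdir, *command]
    return argv


if __name__ == "__main__":
    policy = SandboxPolicy(workdir="/home/me/proj",
                           hidden_paths=["/home/me/proj/.env"])
    print(" ".join(bwrap_argv(policy, ["rg", "foo"])))
```

With something like this, "escalation" is just picking a different `SandboxPolicy` for the same command, and the `rg foo` case never sees the masked file in the first place.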
Seriously. It's not that hard. And you don't have to sandbox the Codex process itself. I honestly have no idea why people think it's necessary to do so. The model has no ability to make Codex-the-POSIX-process do arbitrary things.
[1] I refuse to call it the "bash tool" when most users are running zsh in it. Name things appropriately.
100% this. The idea that Codex should enforce this is putting the security boundary at the wrong layer. If you don’t want Codex to access something, make it so it doesn’t have access.