Claude code (as shown in the repo) can read the files on disk. Isn’t that already exfiltration? In order to read the file, it has to go to Anthropic. I don’t personally have a problem with that but it’s not secret if it leaves your machine.
IMO, you should treat your agent's environment as pre-compromised. In that reading, your goal becomes security-in-depth.
Anthropic is trying to earn developer trust; they have a strong incentive to make sure that private keys and other details that the agent sees do not leak into the training data. But the agent itself is just a glorified autocomplete, and it can get confused and do stupid stuff. So I put it in a transparent prison that it can see out of but can't leave.
That definitely helps with the main failure modes I was worrying about, but it's just one layer. You definitely want to make sure that your production secrets are in an external vault (Hashicorp Vault, Google Secret Store, GitHub secrets, etc) that the agent can't access.
The things that agent is seeing should be dev secrets that maybe could be used as the start of a sophisticated exploit, but not the end of it. There's no such thing as perfect security, only very low probabilities of breach. Adding systems that are very annoying to breach and have little offer when you do greatly lowers the odds.
IMO, you should treat your agent's environment as pre-compromised. In that reading, your goal becomes security-in-depth.
Anthropic is trying to earn developer trust; they have a strong incentive to make sure that private keys and other details that the agent sees do not leak into the training data. But the agent itself is just a glorified autocomplete, and it can get confused and do stupid stuff. So I put it in a transparent prison that it can see out of but can't leave.
That definitely helps with the main failure modes I was worrying about, but it's just one layer. You definitely want to make sure that your production secrets are in an external vault (Hashicorp Vault, Google Secret Store, GitHub secrets, etc) that the agent can't access.
The things that agent is seeing should be dev secrets that maybe could be used as the start of a sophisticated exploit, but not the end of it. There's no such thing as perfect security, only very low probabilities of breach. Adding systems that are very annoying to breach and have little offer when you do greatly lowers the odds.