Regardless of whether this specific claim is true, enterprises are becoming much more cautious about developer tools that can read large portions of proprietary codebases.
If you're using a coding agent, then obviously you need to either serve the model yourself or trust whoever you are sending your data to.
In terms of WHAT you need to be concerned about, it seems it goes far beyond code, and far beyond having to trust your model provider.
A coding agent with access to a bash tool is going to have access to anything that a human with a bash prompt would. Even if you try to provide a locked-down sandbox environment for the agent, you still need to be concerned about things like unencrypted passwords and keys that it may be able to find "lying around" in the code, databases, or anything else it has access to.
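To make the "lying around" point concrete, here is a minimal sketch of the kind of check you could run over a directory before handing it to an agent. The function names `scan_text` and `scan_tree` and the three patterns are my own illustrative assumptions, not any real tool's rule set; dedicated secret scanners use far more rules and would catch much more than this.

```python
import re
from pathlib import Path

# Hypothetical, deliberately tiny rule set for illustration only.
SECRET_PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase alphanumerics.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM-style private key header.
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    # A quoted literal assigned to something that looks like a credential name.
    "password_assignment": re.compile(
        r"(?i)(?:password|passwd|secret|api_key)\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def scan_text(text):
    """Return a list of (line_number, rule_name) for each matching line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


def scan_tree(root):
    """Scan every regular file under root; return {path: findings}."""
    results = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        hits = scan_text(text)
        if hits:
            results[str(path)] = hits
    return results
```

Running `scan_tree(".")` over the workspace you plan to mount into the sandbox gives a rough list of files to clean up first. A clean result from a scan this crude proves very little; it only catches the most obvious cases.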
I'm surprised there haven't yet been more widely disseminated stories about coding agents and claw-bots wreaking havoc.
After they have been uploading their code to private repositories on GitHub, Bitbucket, etc. since forever? They trust GitHub not to read their code, but they don't trust an AI from Microsoft not to read it? That would be schizophrenic.
A bit too late for that; most of them have already dumped most of their codebase and IP into cloud models.
Not to mention that they are capable of executing code and are susceptible to prompt injection, which practically makes them backdoors if you're not very careful about how you use the tooling.
Becoming? We've moved entirely in the opposite direction.
When these tools first appeared, the dominant conversation was about the risk of letting a remote tool siphon your code and intellectual property (which they would eventually add to their training data). Now everyone is using them, and that fear seems to have dissolved. Every corporation is sprinkled with Claude Code, Antigravity, Copilot, Codex, and so on. Even the Chinese providers, long the subject of fear-mongering, are being heavily used in many spaces.
In this case this is a PR battle between two firms, and not much more. And Alibaba isn't worried about the "proprietary code" (the truth is that there is incredibly little interest in most orgs' code), but that the tool is a backdoor, or at least that is the claim.
Wasn't one of the big promises the AI labs made "uncopyrighting"? I.e., the ability to reconstruct large works, including source code, without actual access to the source code? Everything from movies to operating systems.
It's insane that it's becoming a concern now. It should've ended the discussion from the very beginning.