The Belmont analogy is great, but the deeper point is even scarier: most of the industry is giving non-deterministic systems direct access to deterministic infrastructure (databases, shells, email, etc).
Historically we spent decades reducing automation privileges and adding layers of verification. Agents seem to be reversing that trend almost overnight.
As long as the penalties for data breach are a slap on the wrist and buying everyone one year of credit monitoring, no one will.
Goes to a lot of trouble to build a mental model / map / landscape of how agentic ops work. Worth the read if you're looking for one, reasonable people know the map is never the terrain.
Anyone know how many data breaches occur on a monthly basis that would require credit monitoring?
> Not only is this pure science fiction at this point, but injecting non-determinism into your defensive layer is terrifying and incredibly stupid. If you use an LLM to evaluate whether another LLM is doing something malicious, you now have two hallucination risks instead of one. You also risk a prompt-injection attack making it all the way to your security layer.
I've found fictional displays of "system compromise" kinda ridiculous in e.g. Halo. Now I know that Cortana throws AI slop input into AI slop infrastructure with thousands of subagents until she's in.
You know how in video games literally everything is super easy to hack?
Turns out all those games were just very forward-thinking.
[dead]
i do https://github.com/npc-worldwide/npcpy
https://arxiv.org/abs/2506.10077 followup paper coming soon which further demonstrates these contextuality results for a suite of models. there is no way to fundamentally impose on the training data or processing effective guardrails that can transcend this reality.