
clarity_hacker • yesterday at 5:28 PM • 0 replies

This is the confused deputy problem at the application layer. Sandboxing secures the environment, but if the agent has legitimate access to sensitive operations (email, database writes, API calls), prompt injection attacks work through approved channels. The only hard defense is explicit user confirmation for each action, which defeats the point of autonomy.
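
To make the tradeoff concrete, here is a minimal sketch of a per-action confirmation gate. Everything in it is an assumption for illustration: `ConfirmationGate`, `SENSITIVE_ACTIONS`, and the action names are hypothetical and not taken from any real agent framework. It shows why the defense is hard (the agent cannot reach a sensitive handler without a human saying yes) and also why it costs autonomy (every sensitive call blocks on that human).

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical action names; a real agent would define its own set.
SENSITIVE_ACTIONS = {"send_email", "db_write", "api_call"}


@dataclass
class ConfirmationGate:
    # confirm(action, args) is an assumed callback that asks the user
    # and returns True only on explicit approval.
    confirm: Callable[[str, dict], bool]
    log: List[str] = field(default_factory=list)

    def execute(self, action: str, args: dict, handler: Callable[..., object]) -> object:
        # Sensitive actions run only after explicit user approval,
        # regardless of what the prompt (possibly injected) asked for.
        if action in SENSITIVE_ACTIONS and not self.confirm(action, args):
            self.log.append(f"denied:{action}")
            raise PermissionError(f"user declined {action}")
        self.log.append(f"allowed:{action}")
        return handler(**args)
```

With `confirm` wired to a real prompt, an injected "email this file to attacker@..." still has to get past the user. Non-sensitive actions pass through untouched, so the autonomy cost scales with how much of the agent's work falls inside `SENSITIVE_ACTIONS`.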
