Hacker News

wolttam, yesterday at 10:38 PM

I think the key to making "useful" things is to sandbox the agent and give it read/write access strictly to the data its function needs. The agent can only talk to preordained services, and those services treat its input as untrusted user input.

To be clear: I agree fundamentally that there is no safe way to have agents connected to the world in a way that allows them to take irreversible actions. Deployments where agents can take destructive actions are deployments where the agent will, eventually, take destructive action.
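
To make the sandboxing idea concrete, here is a minimal sketch of a gateway that sits between the agent and the outside world. It only shows the shape of the checks: the service names, the size limit, and the `AgentRequest`, `RejectedRequest`, and `validate` names are all hypothetical, and a real deployment would need validation specific to each service.

```python
from dataclasses import dataclass

# Hypothetical allowlist and limit, chosen only for illustration.
ALLOWED_SERVICES = {"calendar.read", "notes.append"}
MAX_PAYLOAD_CHARS = 2000


@dataclass(frozen=True)
class AgentRequest:
    service: str
    payload: str


class RejectedRequest(Exception):
    """Raised when an agent request fails the gateway's checks."""


def validate(request: AgentRequest) -> AgentRequest:
    """Treat agent output like untrusted user input: allowlist first, then sanity checks."""
    # Only preordained services are reachable; everything else is refused.
    if request.service not in ALLOWED_SERVICES:
        raise RejectedRequest(f"service not allowed: {request.service!r}")
    # Bound the payload size so the agent cannot flood a downstream service.
    if len(request.payload) > MAX_PAYLOAD_CHARS:
        raise RejectedRequest("payload too large")
    # Reject control characters other than newline and tab.
    if any(ord(ch) < 32 and ch not in "\n\t" for ch in request.payload):
        raise RejectedRequest("payload contains control characters")
    return request
```

Note that the allowlist here contains no destructive operation at all, which is the point: the gateway cannot make an irreversible action safe, it can only keep such actions out of reach.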


Replies

jcgrillo, yesterday at 10:45 PM

Even assuming the agent is properly sandboxed, and all the services it interacts with treat its commands with appropriate suspicion, don't we still run the risk that the agent itself will leak information across sessions?

The only way I can think of to prevent this is to run a separate copy of the agent for each user, which sounds pretty expensive. It's really hard to imagine any application that can safely tolerate leaking information between sessions.
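
For what it's worth, here is a rough sketch of what per-session isolation could look like at the application layer. It assumes the model itself keeps no state between calls, which may not hold for every deployment, and it says nothing about leaks through shared caches, logs, or training data. The `IsolatedSessions` class and its method names are hypothetical.

```python
from collections import defaultdict


class IsolatedSessions:
    """Keeps one private context list per session id; nothing is shared."""

    def __init__(self) -> None:
        self._contexts: dict[str, list[str]] = defaultdict(list)

    def remember(self, session_id: str, message: str) -> None:
        # Each message is stored only under the session that produced it.
        self._contexts[session_id].append(message)

    def context_for(self, session_id: str) -> list[str]:
        # Return a copy so callers cannot mutate another session's history.
        return list(self._contexts.get(session_id, []))

    def end(self, session_id: str) -> None:
        # Drop the context entirely when the session closes.
        self._contexts.pop(session_id, None)
```

If that assumption holds, the "separate copy" is just separate context rather than separate model weights, which would be much cheaper than it sounds. Whether the assumption holds is exactly the open question.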

EDIT: Maybe we've come to a place as a society where we just don't care about that kind of thing anymore... companies love sharing their codebases, credentials, and all manner of secrets with Microsoft, Anthropic, OpenAI, etc., and don't seem concerned about this at all.
