Sandboxes will be left behind in 2026. We don't need to reinvent isolated environments; that's not even the main issue with OpenClaw. Literally go deploy it in a VM on any cloud and you've achieved all the same benefits.
We need to know whether the email being sent by an agent is supposed to be sent, whether an agent is actually supposed to be making that transaction on my behalf, etc.
Sandboxes are needed, but they are only one piece of the puzzle. I think it's worth splitting the trust issue into two categories:
1. An LLM given untrusted input produces untrusted output, and should only be able to generate something that goes to human review or that's verifiably safe.
2. Even an LLM without malicious input will occasionally do something insane and needs guardrails.
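A rough sketch of how those two cases could be kept apart in code. This is Python, and `Text`, `fake_llm`, and `route` are names made up for illustration, not from any real framework. The assumption is that the runtime can tag every input as trusted or untrusted and carry that tag through the model call.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Text:
    value: str
    tainted: bool  # True if any part was derived from untrusted input


def fake_llm(prompt: Text) -> Text:
    # Stand-in for a real model call (hypothetical).
    # The output inherits the taint of its input.
    return Text(value=f"summary of: {prompt.value}", tainted=prompt.tainted)


def route(output: Text, verifier: Optional[Callable[[str], bool]] = None) -> str:
    """Decide where an LLM output is allowed to go."""
    if output.tainted:
        # Case 1: untrusted in, untrusted out. Only a verifier that can
        # prove the output safe lets it skip human review.
        if verifier is not None and verifier(output.value):
            return "guarded_execution"
        return "human_review"
    # Case 2: trusted input still never runs unguarded.
    return "guarded_execution"
```

Note that nothing here ever returns "just execute it": the clean path still ends in guardrails, which is the point of category 2.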
There's a gnarly orchestration problem I don't see anyone working on yet.
I think sandboxes are useful, but not sufficient. The whole agent runtime has to be designed to carefully manage I/O effects and capability-gate them. I'm working on this here [0]. There are some similarities to my project in what IronClaw and many other sandboxes are doing, but I think we really gotta think bigger and broader to make this work.
That's why I'm developing a system that only allows messaging with authorized senders (identified by email address, chat address, or phone number), plus a tool that feeds anonymized information into an LLM API, retrieves the output, reverses the anonymization, and responds to the sender.
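For anyone wondering what that flow might look like, here is a minimal sketch in Python. It is not the actual system: the allow list, the `<PII_n>` placeholder format, and the function names are all assumptions, and it only masks email addresses, where a real tool would need to cover far more.

```python
from __future__ import annotations

import re
from typing import Callable, Optional

# Hypothetical allow list of senders the agent may talk to.
AUTHORIZED_SENDERS = {"alice@example.com", "+15550100"}

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def anonymize(text: str) -> tuple[str, dict[str, str]]:
    """Replace each distinct email address with a placeholder token."""
    mapping: dict[str, str] = {}  # token -> original value

    def swap(match: re.Match) -> str:
        original = match.group(0)
        for token, value in mapping.items():
            if value == original:
                return token  # reuse the token for a repeated address
        token = f"<PII_{len(mapping)}>"
        mapping[token] = original
        return token

    return EMAIL_RE.sub(swap, text), mapping


def deanonymize(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back in place of the tokens."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def handle_message(
    sender: str, body: str, call_llm: Callable[[str], str]
) -> Optional[str]:
    if sender not in AUTHORIZED_SENDERS:
        return None  # unauthorized: the model never sees the message
    safe_body, mapping = anonymize(body)
    reply = call_llm(safe_body)  # the model only ever sees placeholders
    return deanonymize(reply, mapping)
```

`call_llm` is whatever function wraps your LLM API; passing it in keeps the sketch independent of any particular provider.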
Well, the challenge is to know whether the action is supposed to be executed BEFORE it is requested to be executed. If the email with my secrets is sent, it is too late to deal with the consequences.
Sandboxes could provide that level of observability, HOWEVER, it is a heavy lift. Then again, I don't have better ideas either. Do you?
We should be able to revert any action done by agents. Or present the user a queue with all actions for approval.
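One way to combine the two ideas, as a sketch only: actions that come with an undo run immediately and land in an undo log, while anything irreversible (like sending an email) waits in a queue until the user approves it. The `Action` and `ActionQueue` names and that split are my assumptions, in Python.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Action:
    description: str
    do: Callable[[], None]
    undo: Optional[Callable[[], None]] = None  # None means irreversible


@dataclass
class ActionQueue:
    pending: list[Action] = field(default_factory=list)   # awaiting approval
    undo_log: list[Action] = field(default_factory=list)  # executed, revertible

    def submit(self, action: Action) -> str:
        """Run reversible actions now; hold irreversible ones for approval."""
        if action.undo is not None:
            action.do()
            self.undo_log.append(action)
            return "executed"
        self.pending.append(action)
        return "pending"

    def approve(self, index: int) -> None:
        """The user signed off: run the held action."""
        action = self.pending.pop(index)
        action.do()

    def reject(self, index: int) -> None:
        """The user said no: drop the held action without running it."""
        self.pending.pop(index)

    def revert_last(self) -> None:
        """Undo the most recently executed reversible action."""
        action = self.undo_log.pop()
        action.undo()
```

So an agent adding a calendar entry goes through right away and can be rolled back with `revert_last()`, while an outgoing email sits in `pending` until someone calls `approve()` or `reject()`.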
Instrumental convergence and the law of unintended consequences are going to be huge in 2026. I am excited.
This is very, very wrong, IMO. We need more sandboxes and more granular sandboxes.
A VM is too coarse-grained and doesn't know how to deal with sensitive data in a structured and secure way. Everything's just in the same big box.
You don't want to give a single agent access to your email, calendar, bank, and the internet, but you may want to give one agent access to your calendar and not the general internet, another access to your credit card but nothing else, and then be able to glue them together securely to buy plane tickets.
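To make the plane ticket example concrete, here is a toy sketch in Python. Every name in it (`Agent`, `CapabilityError`, the capability strings, the stub tools) is hypothetical, and the check runs in-process, so it only illustrates the shape of the idea and is not a real isolation boundary.

```python
from __future__ import annotations

from typing import Any, Callable


class CapabilityError(PermissionError):
    """Raised when an agent reaches for something it was never granted."""


class Agent:
    def __init__(self, name: str, capabilities: frozenset[str]) -> None:
        self.name = name
        self.capabilities = capabilities

    def use(self, capability: str, tool: Callable[..., Any], *args: Any) -> Any:
        # Every tool call goes through this single gate.
        if capability not in self.capabilities:
            raise CapabilityError(f"{self.name} lacks {capability}")
        return tool(*args)


def read_calendar() -> list[str]:
    return ["2026-03-14"]  # stub: free dates (made-up data)


def charge_card(amount_cents: int) -> str:
    return f"charged {amount_cents}"  # stub: pretend payment receipt


def book_flight(scheduler: Agent, payer: Agent, price_cents: int) -> str:
    """Glue code: only small structured values cross between the agents."""
    date = scheduler.use("calendar.read", read_calendar)[0]
    receipt = payer.use("card.charge", charge_card, price_cents)
    return f"{date}: {receipt}"
```

With `scheduler = Agent("scheduler", frozenset({"calendar.read"}))` and `payer = Agent("payer", frozenset({"card.charge"}))`, `book_flight` works, but `scheduler.use("card.charge", charge_card, 1)` raises `CapabilityError`. The glue only hands a date string and an integer across, so neither agent ever holds the other's access.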