What are the limits of this? Could you replicate Gemini CLI in the browser, but with better UX to support non-agentic coding use cases?
Could this be used with arbitrary local tools as well? I could be missing something, but I don't see how you could use a non-remote MCP server with this setup.
We never say that it isn't. There is a reason Google developed NaCl in the first place, which inspired WebAssembly to become the ultimate sandbox standard. Not only that, the DOM, JS and CSS also serve as a sandboxed rendering standard, and capability-based design has been present in browsers going all the way back to Netscape Navigator.
Locking down features to provide a unified experience is what a browser should do, after all, no matter the performance cost. Of course, various vendors have tried to break this by introducing platform-specific features, but that's also why IE, and later the non-Chromium Edge, died a horrible death.
There were external sandbox escapes such as Adobe Flash, ActiveX, Java applets and Silverlight, but each of those was often a sandbox of its own, albeit a horrible one...
But with the stabilization of asm.js and later WebAssembly, all of them are gone with the wind.
Sidenote: Flash's scripting language, ActionScript, is also directly responsible for the generational design of Java-ahem-ECMAScript later on, and of TypeScript too.
I don't buy it. It might be very useful for a few use cases, but despite all the desktop automation craze and "Claude for cooking" stuff that will inevitably follow, our computing model for live business applications has, for maintainability, auditability, security, data access, etc., become cloud-centric to a point where running things locally is... kind of pointless for most "real" apps.
Not that I'm not excited about the possibilities in personal productivity, but I don't think this is the way--if it was, we wouldn't have lost, say, the ability to have proper desktop automation via AppleScript, COM, DDE (remember that?) across mainstream desktop operating systems.
I've found it interesting that systemd and Linux user permissions/groups never come into the sandboxing discussions. They're both quite robust, offer a good deal of customization in concert, and by their nature are fairly low cost.
This is a great example of how useful the File System Access API is.
On http://co-do.xyz/ you can select a directory and let AI get to work inside of it without having to worry about side effects.
The File System Access API is the best thing that has happened to the web in years. It makes web apps first-class productivity applications.
The folder input thing caught me off guard too when I first saw it. I've been building web apps for years and somehow missed that `webkitdirectory` attribute.
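For anyone else who missed it, here is a minimal sketch of what a folder input hands back. With `<input type="file" webkitdirectory>`, each selected `File` carries a `webkitRelativePath` that starts with the chosen folder's name. The `PickedFile` interface and the `groupByDirectory` helper are hypothetical names used for illustration only; they are not part of any browser API.

```typescript
// Minimal shape of the File objects a folder input produces.
// Only the two fields used below are modelled (illustrative subset).
interface PickedFile {
  name: string;
  webkitRelativePath: string; // e.g. "project/src/index.ts"
}

// Group file names by their parent directory path.
// Hypothetical helper, not a browser API.
function groupByDirectory(files: PickedFile[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const file of files) {
    const slash = file.webkitRelativePath.lastIndexOf("/");
    // Files without a slash have no directory component.
    const dir = slash === -1 ? "" : file.webkitRelativePath.slice(0, slash);
    if (!groups[dir]) {
      groups[dir] = [];
    }
    groups[dir].push(file.name);
  }
  return groups;
}
```

In a page you would pass something like `Array.from(input.files)` from the input's `change` event. Note that this route only gives read access to a snapshot of the folder, unlike the File System Access API.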
What I find most compelling about this framing is the maturity argument. Browser sandboxing has been battle-tested by billions of users clicking on sketchy links for decades. Compare that to the approach of spinning up a fresh container every time you want to run untrusted code.
The tradeoff is obvious though: you're limited to what browsers can do. No system calls, no arbitrary binaries, no direct hardware access. For a lot of AI coding tasks that's actually fine. For others it's a dealbreaker.
I'd love to see someone benchmark the actual security surface area. "Browsers are secure" is true in practice, but the attack surface is enormous compared to a minimal container.
I'd like to point Simon and others to 2 more things possible in the browser:
1) WebContainer allows Node.js frontend and backend apps to be run in the browser. This is readily demonstrated by the (now sadly unmaintained) bolt.diy project.
2) The JSLinux and x86 Linux examples allow running a complete Linux environment in wasm, with two-way communication. A thin extension adds networking support to Linux.
So it's technically possible to run a pretty full-fledged agentic system with the simple UX of visiting a URL.
> a robust sandbox for agents to operate in
I would like to humbly propose that we simply provision another computer for the agent to use.
I don't know why this needs to be complicated. A nano EC2 instance is like $5/m. I suspect many of us currently have the means to do this on prem without resorting to virtualization.
Last I looked (a couple of years ago), you could ask the user for read-write access to a directory in Chrome using the File System Access API, however you couldn't persist this access, so the user would have to manually re-grant permission every time you reloaded the tab. Has this been fixed yet? It's a showstopper for the most interesting uses of the File System Access API IMO.
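For reference, the usual pattern as I understand it: a directory handle can be stored in IndexedDB, and after a reload you call `queryPermission()` on the stored handle and fall back to `requestPermission()` only if access is no longer granted. The sketch below assumes that flow. `AccessState`, `HandleLike` and `ensureReadWrite` are hypothetical names, and the interface is only the subset of `FileSystemHandle` the function needs. Whether the browser actually re-prompts after a reload depends on the browser version and on what the user chose, so this does not settle the question above.

```typescript
type AccessState = "granted" | "denied" | "prompt";

// Structural subset of FileSystemHandle: only the permission methods are used.
interface HandleLike {
  queryPermission(opts: { mode: "read" | "readwrite" }): Promise<AccessState>;
  requestPermission(opts: { mode: "read" | "readwrite" }): Promise<AccessState>;
}

// Resolves to true when read-write access is available.
// Prompts the user only if the stored grant is no longer in effect.
// In a browser, requestPermission() has to be triggered by a user gesture
// such as a click handler.
async function ensureReadWrite(handle: HandleLike): Promise<boolean> {
  const opts = { mode: "readwrite" as const };
  if ((await handle.queryPermission(opts)) === "granted") {
    return true;
  }
  return (await handle.requestPermission(opts)) === "granted";
}
```

Taking a structural interface rather than the real handle type keeps the permission logic easy to exercise with a stub outside the browser.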
Since AI became capable of long-running sessions with tool calls, one VM per AI as a service became very lucrative. But I do think a large amount of these can indeed run in the browser, especially all the ones that essentially just want to live-update and execute code, or run shells on top of a mounted file system. You can actually do all of this in the user's browser very efficiently. There are two things you lose though: collaboration (you can do it, but it becomes a distributed problem if you don't have a central server) and working in the background (you need to pause all work while the user's tab is suspended or closed).
So if you can work within the constraints there are a lot of benefits you get as a platform: latency goes down a lot, performance may go up depending on user hardware (usually more powerful than the type of VM you'd use for this), bandwidth can go down significantly if you design this right, and your uptime and costs as a platform will improve if you don't need to make sure you can run thousands of VMs at once (or pay a premium for a platform that does it for you)[1]
All that said, I'm not sure trying to put an entire OS or something like WebContainers in the user's browser is the way. I think you need to build a slightly custom runtime for this type of local agentic environment. But I'm convinced it's the best way to get the smoothest user experience and smoothest platform growth. We did this at Framer to be able to recompile any part of a website into React code at 60+ frames per second, which meant fewer tricks were necessary to make the platform both feel snappy and be able to publish in a second.
[1] For big model providers like OpenAI and Anthropic there's an interesting edge they have in that they run a tremendous amount of GPU-heavy loads and have a lot of CPUs available for this purpose.
This sandboxes your file system. That's just one class of problem. People will want to hook this up to their inbox, their calendar, their chats, their source code, their finances, etc. File system secured? Great. Everything else? Not so much.
That said, it's a good start.
Wrong title: if it relies on the "File System Access API (still Chrome-only as far as I can tell)", then it should read "A browser is the sandbox".
At the risk of sounding obvious:
- Chrome (and Chromium) is a product made and driven by one of the largest advertising companies (Alphabet, formerly Google) as a strategic tool for its business model
- Chrome is one browser among many, it is not a de facto "standard" just because it is very popular. The fact that there are a LOT of people unable to use it (iOS users) even if they wanted to proves the point.
It's quite important not to conflate experimental features put in place by some vendors (yes, even the most popular ones) with "the browser".
At the moment I'm fairly OK using docker + integration scripts / tools that expose host OS functionality (like if it needs screenshots etc).
I know there are lots of good arguments why docker isn't perfect isolation. But it's probably 3 orders of magnitude safer than running directly on my computer, and the alignment with the existing dev ecosystem (dev containers, etc) makes it very streamlined.
It's fascinating that browsers are one of the most robust and widely available sandboxing systems and we have yet to make a claude-code/gemini-cli-like agent that runs inside the browser.
Browsers as an agent environment open up a ton of exciting possibilities. For example, agents now have an instant way to offer UIs based on tech governed by standards (HTML/CSS) instead of platform-specific UI bindings. A way to run third-party code safely in wasm containers. A way to store information on disk with enough confidence that it won't explode the user's disk drive. All this basically for free.
My bet is that eventually we'll end up with a powerful agentic tool that uses the browser environment to plan and execute personal agents or to deploy business agents that don't access system resources any more than browsers do at the moment.
Agree! And this is why it is a bad idea IMHO for agents to sit at the abstraction layer of browser or below (OS). Even at the browser-addon level it's dangerous. It runs with the user’s authority across contexts and erodes zero-trust by becoming a confused deputy: https://en.wikipedia.org/wiki/Confused_deputy_problem
We applied a lot of the technical hacks described in this article and the original one to provide a full Linux environment (including networking and mounting directories) running inside the browser. https://endor.dev/s/lamp
What I'd really like to see is some kind of iframe that pins JS/wasm code within it to a particular bundle hash and prevents modification at runtime (even from chrome extensions).
Something more like a TEE inside the browser of sorts. Not sure if there is anything like this.
If you ask a blacksmith how to fix a screw, he might say "just hit one strike with this good old hammer". Coding agents are integral to IDEs.
> Paul Kinlan is a web platform developer advocate at Google
That's enough reason for me to say, f** no. Google will try as hard as possible to promote this even if it's not technically the best solution.
Using anything other than a Linux CLI and file system seems like a misstep to me - it’s what LLMs know best and can use best.
A sandbox is meant to be a controlled environment where you can execute code safely. Browsers can access your email, banking, commerce and the keys to your digital life.
Browsers are closer to operating systems than to sandboxes, so giving an agent access of any kind seems dangerous. I can see the post is talking about the File System Access API, so perhaps a better phrasing is "the browser has a sandbox"?
The browser is the most effective environment to distribute and isolate applications. We have built technologies for years to leverage these capabilities to run legacy Java (CheerpJ) and x86 binaries (CheerpX / WebVM).
We are soon going to release a new technology, built on top of the same stack, to allow full-stack development completely in the browser. It's called BrowserPod and we think it will be a perfect fit for agents as well.
Unfortunately sandboxing your computer from the browser won’t sandbox gullible agents away from your online banking.
That's an interesting insight. I just added file system support to my internal tool. I thought this was not possible in Firefox, but the workaround you mentioned works. Thanks!
By any chance, does anyone know if user clicks can be captured for a website/tab/iframe during screen recording? I know I can record the screen, but I am wondering if this metadata can be collected as well.
I like the perspective used to approach this. Additionally, the fact that major browsers can accept a folder as input is new to me and opens up some exciting possibilities.
Are you aware of any lightweight sandboxes for Python? Not browser-based.
I’m not entirely sure this is better than native sandboxes?
Good time to surface the limitations of a Content Security Policy: https://github.com/w3c/webappsec-csp/issues/92
Also, the double iframe technique is important for preventing exfiltration through navigation, but you have to make sure you don't allow top navigation. The outer iframe will prevent the inner iframe from loading anything outside of the frame-src origins. This could mean restricting it to a single server, which would still allow sending data to that server, but if it's your server or a server you trust, that might be OK. Or it could mean srcdoc and/or data URLs for local-only navigation.
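A rough sketch of the two knobs involved, under my own assumptions about how you would wire them up: a `Content-Security-Policy` value for the outer frame's document that limits `frame-src`, and a `sandbox` attribute value that never includes any of the `allow-top-navigation*` tokens. `outerFrameCsp` and `sandboxTokens` are hypothetical helper names, and the `default-src 'none'` baseline is my assumption rather than something the technique requires.

```typescript
// Build a CSP value for the OUTER frame's document.
// frameSources lists what the inner frame may load, e.g. ["data:"] or
// ["https://sandbox.example"]. An empty list blocks all framed content.
// Assumption: the outer frame is a bare wrapper, so default-src 'none' is fine.
function outerFrameCsp(frameSources: string[]): string {
  const frameSrc = frameSources.length > 0 ? frameSources.join(" ") : "'none'";
  return ["default-src 'none'", "frame-src " + frameSrc].join("; ");
}

// Build the value of the iframe sandbox attribute.
// Any token beginning with "allow-top-navigation" is dropped, so the framed
// content can never navigate the top-level page.
function sandboxTokens(extra: string[] = []): string {
  const safeExtra = extra.filter(
    (token) => token.indexOf("allow-top-navigation") !== 0
  );
  return ["allow-scripts"].concat(safeExtra).join(" ");
}
```

For the local-only case, `outerFrameCsp(["data:"])` yields a policy where the inner frame can only be pointed at data URLs. How srcdoc content interacts with `frame-src` is worth checking against the CSP spec rather than taking from this sketch.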
I find the WebAssembly route a lot more likely to be able to produce true sandboxen.
> Over the last 30 years, we have built a sandbox specifically designed to run incredibly hostile, untrusted code from anywhere on the web
Browser sandboxes are Swiss cheese. In 2024 alone, Google's threat researchers tracked 75 zero-days exploited in the wild, browser exploits among them.
Browsers are the worst security paradigm. They have tens of millions of lines of code, far more than operating system kernels. The more lines of code, the more bugs. They include features you don't need, with no easy way to disable them or opt-in on a case-by-case basis. The more features, the more an attacker can chain them into a usable attack. It's a smorgasbord of attack surface. The ease with which the sandbox gets defeated every year is proof.
So why is everyone always using browsers, anyway? Because they mutated into an application platform that's easy to use and easy to deploy. But it's a dysfunctional one. You can't download and verify the application via signature, like every other OS's application platform. There's no published, vetted list of needed permissions. The "stack" consists of a mess of RPC calls to random remote hosts, often hundreds if not thousands required to render a single page. If any one of them gets compromised, or is just misconfigured, in any number of ways, so does the entire browser and everything it touches. Oh, and all the security is tied up in 350 different organizations (CAs) around the world, which if any are compromised, there goes all the security. But don't worry, Google and Apple are hard at work to control them (which they can do, because they control the application platform) to give them more control over us.
This isn't secure, and there's really no way to secure it. And Google knows that. But it's the instrument making them hundreds of billions of dollars.
This is obvious isn’t it - headless browsers are the best sandbox if you want the features and can afford the weight.
This is the kind of thing that the browser should not need to do. This is the kind of thing that the operating system should be doing. The operating system (the thing you use to run programs securely) should be securing you from bad anything, not just bad native applications.
A large part of the web is awful because of all the things browsers must do that the operating system should already be doing.
We have all tolerated stagnant operating systems for too long.
Plan 9's inherent per-process namespacing has made me angry at the people behind Windows, MacOS, and Linux. If something is a security feature and it's not an inherent part of how applications run, then you have to opt in, and that's not really good enough anymore. Security should be the default. It should be inherent, difficult to turn off for a layman, and it should be provided by the operating system. That's what the operating system is for: to run your programs securely.
The browser being the sandbox isn't a good thing. It's frankly one of the greatest failures of personal computer operating systems.
Can you believe that if you download a calculator app it can delete your $HOME? What kind of idiot designed these systems?
An interesting technique.
The problem discussed by both Simon and Paul, where the browser can absolutely trash any directory you give it, is perhaps the paradigmatic example of where git worktree is useful.
Because you can check out the branch for the browser/AI agent into a worktree, and the only file there that halfway matters is the single file in .git which explains where the worktree comes from.
It's really easy to fix that file up if it gets trashed, and it's really easy to use git to see exactly what the AI did.
Coding agents may become trivial artifacts to be assembled by developers themselves from libraries, given the well-defined workflow. If it is a homegrown agent then you probably don't need a sandbox to run in.
This is an entry on my link blog - make sure to read the article it links to for full context, my commentary alone might not make sense otherwise: https://aifoc.us/the-browser-is-the-sandbox/