I’m super impressed by how "zillions of lines of code" got re-branded as a reasonable metric by which to measure code, just because it sounds impressive to laypeople and incidentally happens to be the only thing LLMs are good at optimizing.
> According to Perplexity, my AI chatbot of choice, this week‑long autonomous browser experiment consumed in the order of 10-20 trillion tokens and would have cost several million dollars at then‑current list prices for frontier models.
Don't publish things like that. At the very least link to a transcript, but this is a very non-credible way of reporting those numbers.
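For what it's worth, the arithmetic is trivial to redo yourself, and doing so mostly shows how sensitive the figure is to the assumed price. A sketch with purely hypothetical per-million-token rates (not actual prices for any model):

```rust
// Back-of-envelope sanity check on the quoted figures.
// The per-million-token rates below are illustrative assumptions only.
fn main() {
    let token_counts = [10e12_f64, 20e12]; // the quoted 10-20 trillion tokens
    let rates_per_million = [0.5_f64, 2.0, 10.0]; // hypothetical $/1M tokens

    for tokens in token_counts {
        for rate in rates_per_million {
            let cost = tokens / 1_000_000.0 * rate;
            println!(
                "{:>3.0}T tokens at ${:>4.1}/M  ->  ${:>6.1}M",
                tokens / 1e12,
                rate,
                cost / 1e6
            );
        }
    }
}
```

Under those assumed rates the answer swings by more than an order of magnitude, which is exactly why a bare number with no transcript or methodology isn't worth much.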
I think it's impressive for what it is: this level of complexity reached by an AI-only workflow. Previously, anything of modest complexity required a lot of human guidance, and even with that had some serious shortcomings and crutches. If you extrapolate that the models themselves, the frameworks for inter-model workflows, the tooling available to the models, and the hardware running them are all accelerating, it's not hard to envision where this will get to. That makes this a notable achievement, particularly when compared with the effort and resources behind the browser engines we use today: many decades and countless millions of man-hours.
Fully agree that the original authors made some unsubstantiated and unqualified claims about what was done, which is sad, because it was still a huge accomplishment as I see it.
From an engineer working on this here on HN:
> ...while far off from feature parity with the most popular production browsers today...
What a way to phrase it!
You know, I found a bicycle in the trash. It doesn't work great yet, but I can walk it down a hill. While far off from the level of the most popular supercars today, I think we have made impressive progress going down the hill.
Just had my manager submit 3 PRs in a language he doesn't understand (Rust) that he hasn't run or tested, and he's demanding quick reviews of hundreds of lines of code. These are tools, but some people are clueless.
If you want to learn more about the Cursor project directly from the source, I conducted a 47-minute interview with Wilson Lin, the developer behind FastRender, last week.
We talked about dependencies, among a whole bunch of other things.
You can watch the full video on YouTube or read my extracted highlights here: https://simonwillison.net/2026/Jan/23/fastrender/
I don't think the point was to say "look, AI can just take care of writing a browser now". I think it was to show just how far the tools have come. It's not meant to be production quality; it's meant to be an impressive demo of the state of AI coding, showing how far it can be taken without completely falling over.
EDIT: I retract my claim. I didn't realize this had servo as a dependency.
You would think a CEO with a product that caters to developers would know that everyone was going to clone the repo and check his work. He just squandered a whole lot of credibility.
Is there a way to measure the entropy of a piece of software?
Is entropy increasing or decreasing the longer agents work on a code base? If it's decreasing, no matter how slowly, theoretically you could just say "OK, start over and write version 2 using what you've learned on version 1." And eventually, $XX million and YY months of churning later, you'd get something pretty slick. And then future models would just further reduce X and Y. Right?
Maybe they just need to keep iterating.
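Not in any canonical sense, but one crude proxy people sometimes reach for is byte-level Shannon entropy (or compressibility) of the source tree. A sketch, with the heavy caveat that this measures statistical redundancy, not design quality:

```rust
use std::collections::HashMap;

/// Byte-level Shannon entropy in bits per byte: a crude, purely
/// illustrative proxy. It says nothing about architecture or design,
/// only about how statistically "random" the text is.
fn shannon_entropy(data: &[u8]) -> f64 {
    let mut counts: HashMap<u8, usize> = HashMap::new();
    for &b in data {
        *counts.entry(b).or_insert(0) += 1;
    }
    let n = data.len() as f64;
    counts
        .values()
        .map(|&c| {
            let p = c as f64 / n;
            -p * p.log2()
        })
        .sum()
}

fn main() {
    // Two arbitrary snippets; real use would walk the repo and aggregate.
    let snippet_a = b"fn add(a: i32, b: i32) -> i32 { a + b }";
    let snippet_b = b"fn a9Zq(xQ: i32, _v7: i32) -> i32 { xQ ^ 0x5f3759df }";
    println!("snippet_a: {:.3} bits/byte", shannon_entropy(snippet_a));
    println!("snippet_b: {:.3} bits/byte", shannon_entropy(snippet_b));
}
```

Tracking this (or a gzip compression ratio) per line of code across versions would at least give a trend line, though it can't tell you whether the agents are converging on a cleaner design or just churning out more uniform boilerplate.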
> I'd just cloned a copy of Chromium myself, and for all that time and money, independent developers who cloned the repo reported that the codebase is very far from a functional browser. Recent commits do not compile cleanly, GitHub Actions runs on main are failing, and reviewers could not find a single recent commit that was built without errors.
Significant typo I assume?
Anyone remember finding the Internet Explorer control in Windows Forms, dropping it onto a form, adding some buttons, and telling people you'd made your own web browser? Maybe this exercise is eternal, just in different forms.
If I were to spend a trillion tokens on a barely working browser, I would have started with the source code of Sciter [0] instead. I really like the premise of an Electron alternative that compiles to a 5MB binary, with a custom data store based on DyBASE [1] built into the front-end JavaScript so you can just persist any object you create. I was ready to build software on top of it but couldn't get the basic Windows tutorial to work.
If we have been complaining about bloat before, the amount of bloat we are going to witness in the future is unfathomable. How can anyone be proud of a claim like "It's 3M+ lines of code across thousands of files," _especially_ when a lot of that code relies on external dependencies? Less code is almost always better, not more!
I'm also getting really tired of claims like "we are X% more productive with AI now!" (which I'm hearing day in and day out at work, and on LinkedIn of course). Didn't we, as an industry, agree that we _didn't_ know how to measure productivity? Why is everyone suddenly believing metrics that claim otherwise?
Look, I'm not against AI. I'm finding it quite valuable for certain scenarios -- but in a constrained environment and with very clear guidance. Letting it loose on coding is not one of them, and the hype is dangerous given how widely it's believed.
When AI does `x`, check with people familiar with `x`.
That's basically the entire hype cycle: LLM builders see a bunch of hyper-specific language in fields they're not experts in and think "wow, AI is really smart!"
Our modern economy is nearly entirely built on useless bullshit, this is just what it looks like when the ouroboros starts devouring its own tail. It doesn't matter that the product doesn't work; the hype is the product. In our collective nihilism, we have productized faith itself.
People who think this doesn't matter just because the code is awful, or because it used dependencies, or whatever, are missing the point.
6 months ago with previous models this was absolutely impossible. One of the biggest limitations of LLMs is their difficulty with long tasks. This has been steadily improving and this experiment was just another milestone. It will be interesting a year from now to test how much better new models fare at this task.
1% import open-source-incumbent 99% misdirection slop
I mean, maybe they should have started simple and slowly iterated.
project 1: build a text-based browser using ratatui and quickjs.
project 2: base it on project 1. Convert to a GUI; pages should render pure HTML.
project 3: Acid1 compliance. Use constraint-based programming to compute the final render; no animation support. And so on from there (a toy sketch of a project-1 starting point follows below).
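To make project 1 concrete, here's a toy sketch of the very first slice you'd need: flattening an HTML fragment into plain text that a TUI could display. The ratatui event loop and quickjs embedding are deliberately left out, and none of this reflects how the actual FastRender code is structured; it's only meant to show the scale of a sensible starting point.

```rust
/// A toy first step toward "project 1": strip tags from an HTML fragment
/// so the remaining text can be handed to a TUI for display. Real HTML
/// parsing is far messier (entities, scripts, CDATA); this is only a sketch.
fn html_to_text(html: &str) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    for ch in html.chars() {
        match ch {
            '<' => in_tag = true,
            '>' => {
                in_tag = false;
                // Treat each removed tag as a word separator.
                out.push(' ');
            }
            c if !in_tag => out.push(c),
            _ => {}
        }
    }
    // Collapse runs of whitespace left behind by removed markup.
    out.split_whitespace().collect::<Vec<_>>().join(" ")
}

fn main() {
    let page = "<html><body><h1>Hello</h1><p>A <b>text-only</b> browser.</p></body></html>";
    println!("{}", html_to_text(page));
}
```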
Every single high-profile story that shows up in the feeds about how LLMs are just about there and coders are doomed, if you actually read it and you're a programmer, turns out to be a story about how LLMs are bad and generate trash code that rarely even looks superficially good and definitely doesn't work.
There was a story going around about LLMs making Minesweeper clones, and they were all terrible in extremely dumb ways. The headline wasn't obvious, so I assumed the takeaway people would draw was that AI is making the same dumb mistakes it was making a year ago. Nope. It was people ranting about how coders are going to be out of a job next week.
Meanwhile, none of them can produce a Minesweeper clone, with something like 50 working examples online, maybe 8 things you have to get right to be perfect, and 9000 articles about Minesweeper (even mathematical papers) making everything about the game and its purpose perfectly clear. And then the AI generates buttons that don't do anything and timers that don't stop.
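To put "maybe 8 things you have to get right" in perspective, the core game logic really is tiny. Here's a hedged sketch (mine positions passed in instead of randomized, no UI, no timer), not a claim about what any of those LLM-generated clones contained:

```rust
// Minimal minesweeper core: board setup, adjacency counts, and flood-fill
// reveal. Mine positions are passed in (no RNG dependency); UI, flags, and
// the timer are left out. Purely illustrative.
struct Board {
    w: usize,
    h: usize,
    mines: Vec<bool>,
    adjacent: Vec<u8>,
    revealed: Vec<bool>,
}

impl Board {
    fn new(w: usize, h: usize, mine_cells: &[(usize, usize)]) -> Self {
        let mut mines = vec![false; w * h];
        for &(x, y) in mine_cells {
            mines[y * w + x] = true;
        }
        let mut adjacent = vec![0u8; w * h];
        for y in 0..h {
            for x in 0..w {
                adjacent[y * w + x] = Self::neighbors(w, h, x, y)
                    .filter(|&(nx, ny)| mines[ny * w + nx])
                    .count() as u8;
            }
        }
        Board { w, h, mines, adjacent, revealed: vec![false; w * h] }
    }

    /// All in-bounds neighbors of (x, y), excluding the cell itself.
    fn neighbors(w: usize, h: usize, x: usize, y: usize) -> impl Iterator<Item = (usize, usize)> {
        (-1i64..=1)
            .flat_map(move |dy| (-1i64..=1).map(move |dx| (dx, dy)))
            .filter(|&(dx, dy)| (dx, dy) != (0, 0))
            .filter_map(move |(dx, dy)| {
                let (nx, ny) = (x as i64 + dx, y as i64 + dy);
                (nx >= 0 && ny >= 0 && (nx as usize) < w && (ny as usize) < h)
                    .then(|| (nx as usize, ny as usize))
            })
    }

    /// Reveal a cell; returns false if it was a mine. Zero-adjacency cells
    /// flood-fill outward, as in the classic game.
    fn reveal(&mut self, x: usize, y: usize) -> bool {
        if self.mines[y * self.w + x] {
            return false;
        }
        let mut stack = vec![(x, y)];
        while let Some((cx, cy)) = stack.pop() {
            let ci = cy * self.w + cx;
            if self.revealed[ci] {
                continue;
            }
            self.revealed[ci] = true;
            if self.adjacent[ci] == 0 {
                stack.extend(
                    Self::neighbors(self.w, self.h, cx, cy)
                        .filter(|&(nx, ny)| !self.mines[ny * self.w + nx]),
                );
            }
        }
        true
    }
}

fn main() {
    let mut b = Board::new(4, 4, &[(3, 3)]);
    assert!(b.reveal(0, 0)); // safe corner floods most of the board
    assert!(!b.reveal(3, 3)); // stepping on the mine loses
    let opened = b.revealed.iter().filter(|&&r| r).count();
    println!("revealed {} of {} cells", opened, b.w * b.h);
}
```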
grifters gonna grift
FTA:
> tools like Cursor can be genuinely helpful as glorified autocomplete and refactoring assistants
That suggests a fairly strong anti-AI bias by the author. Anyone who thinks that this is all AI coding tools are today is not actually using them seriously.
That's not to say that this exercise wasn't overhyped, but a more useful, less biased article that's not trying to push an agenda would look at what went right, as well as what went wrong.
AI will never be able to create a browser, just as AI was never able to defeat a chess grandmaster.
I love the quote from Gregory Terzian, one of the servo maintainers:
> "So I agree this isn't just wiring up of dependencies, and neither is it copied from existing implementations: it's a uniquely bad design that could never support anything resembling a real-world web engine."
It hurts that it wasn't framed as an experiment, as in "Look, we wanted to see how far AI can go; it didn't quite clear the bar." As it stands, it's grist for the mill of every CEO out there who has no clue about coding but wonders why their people are so expensive when "AI can do it! D'oh!"