> Large language models are *something else entirely*. They are black boxes. You cannot audit them. You cannot truly understand what they do with your data. You cannot verify their behaviour. And Mozilla wants to put them at the heart of the browser, and that doesn't sit well.
Am I being overly critical here, or is this kind of a silly position to take right after conceding that neural machine translation is okay? Many of Firefox's LLM features, like summarization, are (afaik) powered by local models; even Chrome has local model options. It's odd to say neural translation is not a black box while LLMs somehow are black boxes whose handling of your data we cannot hope to understand, especially since, viewed a bit fuzzily, LLMs are scaled-up versions of an architecture originally built for neural translation. Neural translation's behaviour is unverifiable in exactly the same sense.
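For concreteness, "local" here just means the weights run on-device with no network round trip at inference time. A minimal sketch of on-device summarization, assuming Hugging Face transformers and a small distilled model (this is an illustration, not Firefox's actual stack):

```python
# Runs entirely on-device once the weights are cached locally;
# no page content leaves the machine at inference time.
from transformers import pipeline

# Model choice is illustrative, not what any browser actually ships.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

article = (
    "Mozilla is adding AI features to Firefox, including on-device "
    "translation and summarization, while critics argue that large "
    "language models are unauditable black boxes."
)
print(summarizer(article, max_length=40, min_length=10)[0]["summary_text"])
```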
I could read some of the data talk as being about non-local models, but this very much seems like a more general criticism of LLMs as a whole in the context of Firefox features. Moreover, some of the critiques, like verifiability of outputs and unlimited scope, still don't make sense here. Browser LLM features, outside of explicitly AI browsers like Comet, have so far been scoped: either to very narrow tasks like translation or summarization, or, at broadest, to a side panel that lets you ask about a web page with context. Even then, I don't see what is inherently problematic about such scoping, since the output behaviour is confined to the side panel.
Aside: Does anyone actually use summarization features? I've never once been tempted to "summarize" because when I read something I either want to read the entire thing, or look for something specific. Things I want summarized, like academic papers, already have an abstract or a synopsis.
Looking back with fresh eyes, I definitely think I could’ve presented what I’m trying to say better.
On purely technical grounds, you're right that I'm drawing a distinction that may not hold up. Maybe the better framing is: I trust constrained, single-purpose models with somewhat verifiable outputs (text goes in, translated text comes out, and you can check the round trip for consistency, as sketched below) more than I trust general-purpose models with broad access to my browsing context, regardless of whether they're both neural networks under the hood.
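A rough sketch of what "somewhat verifiable" means here: you can round-trip a sentence and eyeball the drift, which has no real analogue for an open-ended chat model. The checkpoints are Helsinki-NLP's public MarianMT models, used purely for illustration, not whatever Firefox actually ships:

```python
# Back-translation consistency check: translate out and back, then compare.
from transformers import pipeline

en_to_de = pipeline("translation", model="Helsinki-NLP/opus-mt-en-de")
de_to_en = pipeline("translation", model="Helsinki-NLP/opus-mt-de-en")

original = "The browser translates this page locally."
german = en_to_de(original)[0]["translation_text"]
round_trip = de_to_en(german)[0]["translation_text"]

print(german)
print(round_trip)  # should stay close to `original`; large drift is easy to spot
```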
WRT the "scope", maybe I've got the wrong end of the stick about what Mozilla is planning, but they've already picked all the low-hanging fruit of AI integration with the features you mentioned, and the fact that they seem to want to dig their heels in further signals, at least to me, that they want deeper integration. Although who knows; the new CEO's post may also be a litmus test to see what response it elicits, and then they'll go from there.
Firefox should look like Librewolf in the first place; Librewolf shouldn't have to exist. Mozilla's privacy stuff is marketing bullshit, just like Apple's. The browser shouldn't be doing ANYTHING that isn't local-only unless it's explicitly opt-in or driven by a user UI action. The LLM part is absurd because the entire Overton window is in the wrong place.
The thing about translation: even a human translator will sometimes make silly mistakes unless they know the domain really well, so LLMs are not any worse. Translation is a problem with no deterministic solution (rule-based translation has always been a bad joke). Properly implemented deterministic search/information retrieval, on the other hand, works extremely well. So well that it doesn't really need any replacement, except when you also want some extra dynamics on top, like filtering SEO slop, and that's not something LLMs can improve at all.
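To make the "deterministic" contrast concrete: classic lexical retrieval is a pure function of corpus and query, so the same inputs always produce the same ranking. A toy sketch using the rank_bm25 package (corpus invented for illustration):

```python
# BM25 is deterministic: no sampling, no temperature, same scores every run.
from rank_bm25 import BM25Okapi

corpus = [
    "firefox ships local translation models",
    "llm summarization in a side panel",
    "deterministic search and information retrieval",
]
bm25 = BM25Okapi([doc.split() for doc in corpus])

query = "deterministic retrieval".split()
print(bm25.get_scores(query))             # identical scores on every run
print(bm25.get_top_n(query, corpus, n=1)) # always the same top document
```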
No, it is disqualifyingly clueless. The author defends one neural network, one bag of effectively opaque floats blended together via WASM to produce non-deterministic outputs that get injected into the DOM (translation), then righteously crusades against other bags of floats (LLMs).
From this point of view, uBlock Origin is also effectively un-auditable.
Or your point that they may be imagining AI as non-local proprietary models might be the only reading that makes this make sense. I think even technical people are being suckered by the marketing that "AI" === ChatGPT/Claude/Gemini-style cloud-hosted proprietary models connected to chat UIs.
To be more charitable to TFA, machine translation is a field with no great alternatives and a pretty limited downside: if something is in another language, the alternative is not reading it at all. You can translate a bunch of documents, benchmark the result, and demonstrate that the model doesn't completely change simple sentences. A related area is OCR: there are occasional mistakes, but it's tractable to build a model and verify it's mostly correct.
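That benchmarking loop is cheap to set up: translate a fixed test set, score against human references, and track one number across model versions. A sketch with the sacrebleu package (sentences are toy data, not a real test set):

```python
# Score model output against human reference translations with corpus BLEU.
import sacrebleu

# Hypotheses would come from the translation model under test.
hypotheses = ["The cat sits on the mat.", "It is raining today."]
references = [["The cat is sitting on the mat.", "It rains today."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(bleu.score)  # one comparable number; mangling simple sentences tanks it
```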
LLMs being applied to everything under the sun feels like solving problems that already have solutions, with answers that aren't necessarily correct or accurate. I don't need a dubiously accurate summary of an article in English; I can read and comprehend it just fine. The downside is real and the utility is limited.