> It works, I've shipped this as a "local inference"/poor person's ollam...

subhobroto • today at 3:48 AM • 4 replies

> It works, I've shipped this as a "local inference"/poor person's ollama for low-end llm tasks like search

fantastic!

> the model download is orders of magnitude greater than downloading the browser itself, and something that needs to happen before you get your first token back

sure, but does this mean the model is lazily downloaded? that is, if my page were the first to call the model, would the user be stuck waiting right then until the download finished?

that sounds like a horrible user experience - maybe chrome reduces the confusion by showing a download status dialog or similar?

also, any idea what the on-disk impact is?

Replies

avaer • today at 5:07 AM

The model download is lazy and cached, so it's a one-time cost presumably across all origins (I assume so since the alternative would be a trivial DoS waiting to happen).

So it's once per browser, not once per site.

You can track the download state yourself and pop whatever UI you want.
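
A minimal sketch of what tracking the download state could look like, assuming the Prompt API shape Chrome documents at the time of writing (`LanguageModel.availability()`, plus a `monitor` callback on `LanguageModel.create()` that fires `downloadprogress` events with `loaded` as a 0..1 fraction). `reportDownload` and `createSessionWithProgress` are hypothetical helper names, and the API itself may change or be missing in a given browser:

```typescript
// Assumed shape of the progress event: `loaded` is a fraction between 0 and 1.
interface DownloadProgressLike {
  loaded: number;
}

// Minimal subset of the monitor object handed to the `monitor` callback.
interface MonitorLike {
  addEventListener(
    type: "downloadprogress",
    listener: (e: DownloadProgressLike) => void
  ): void;
}

// Hypothetical helper: forwards progress to any UI callback as a whole percentage.
function reportDownload(
  monitor: MonitorLike,
  onPercent: (pct: number) => void
): void {
  monitor.addEventListener("downloadprogress", (e) => {
    // Clamp defensively in case the fraction falls outside 0..1.
    const clamped = Math.min(1, Math.max(0, e.loaded));
    onPercent(Math.round(clamped * 100));
  });
}

// Browser-side usage. `LanguageModel` is looked up dynamically because it is
// an assumption that the current browser exposes it at all.
async function createSessionWithProgress(
  onPercent: (pct: number) => void
): Promise<unknown> {
  const LM = (globalThis as any).LanguageModel;
  if (!LM) throw new Error("Prompt API not available in this browser");

  // Documented states: "unavailable", "downloadable", "downloading", "available".
  const availability: string = await LM.availability();
  if (availability === "unavailable") {
    throw new Error("On-device model unavailable on this device");
  }

  // If the model is not cached yet, create() triggers the lazy download
  // and the monitor receives progress events along the way.
  return LM.create({
    monitor(m: MonitorLike) {
      reportDownload(m, onPercent);
    },
  });
}
```

Something like `createSessionWithProgress(pct => { bar.value = pct; })` would then drive a progress bar (`bar` being whatever element you own), and checking `availability()` first lets you skip the UI entirely when the model is already on disk.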

tastroder • today at 5:25 AM

chrome://on-device-internals reports "Model Name: v3Nano Version: 2025.06.30.1229 Folder size: 4,072.13 MiB" on a random Windows machine I just checked.

why_is_it_good • today at 4:33 AM

> Storage: At least 22 GB of free space on the volume that contains your Chrome profile.

jfoster • today at 7:05 AM

Doesn't sound great, but consider how much better this is than every webpage trying to load their own models.

If it turns out to be useful enough, I'm sure browsers will just start including it as a (perhaps optional?) part of installation.
