logoalt Hacker News

tyushktoday at 5:35 PM0 repliesview on HN

I don't think local as it stands with browsers will take off simply from the lead time (of downloading the model), but a new web API for LLMs could change that. Some standard API to communicate with the user's preferred model, abstracting over local inference (like what Chrome does with Gemini Nano (?)) and remote inference (LM Studio or calling out to a provider). This way, every site that wants a language model just has to ask the browser for it, and they'd share weights on-disk across sites.