A 1B model at 2-bit quantization is about the size of the average web page these days. With a bit of WebGPU support you could run such a model right in the browser.
I'm half joking — a 1B model at 2 bits per weight is still roughly 250 MB — but web pages are ludicrously fat these days.
Something like this? https://github.com/huggingface/transformers.js-examples/tree...
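Yes, that's the right neighborhood. For anyone curious what the WebGPU path looks like in practice, here's a minimal sketch using the @huggingface/transformers package (transformers.js v3). The model id and the 4-bit dtype are illustrative assumptions rather than a tested recipe (browser-side quantization typically bottoms out around 4-bit, not 2-bit):

    // Minimal sketch: run a small quantized LLM client-side via WebGPU.
    // Assumes the @huggingface/transformers (transformers.js v3) package.
    import { pipeline } from '@huggingface/transformers';

    // Load a ~1B-parameter model with 4-bit weights, targeting WebGPU.
    // The model repo below is an assumption for illustration.
    const generator = await pipeline(
      'text-generation',
      'onnx-community/Llama-3.2-1B-Instruct',
      { device: 'webgpu', dtype: 'q4' },
    );

    // Generate a short completion entirely in the browser.
    const output = await generator('Why run an LLM in the browser?', {
      max_new_tokens: 64,
    });
    console.log(output[0].generated_text);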