This is so cool, thanks for sharing. I imagine it’s not technically possible (yet?), but it would be cool if this could simply run as a browser extension rather than in a Docker container.
There are a couple of WebGPU LLM platforms available that form the building blocks to accomplish this right from the browser, especially since the models are so small.
https://github.com/mlc-ai/web-llm
https://huggingface.co/docs/transformers.js/en/index
You do have to worry about WebGPU compatibility in browsers though.
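For what it's worth, the compatibility check itself is small. A rough sketch, assuming the standard `navigator.gpu` entry point and its `requestAdapter()` method; the `NavigatorLike` type and both helper names are made up here so the snippet can also run outside a browser:

```typescript
// Minimal structural types so this compiles without WebGPU typings.
// These interfaces are illustrative, not part of any library.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Cheap synchronous check: is the WebGPU entry point exposed at all?
function hasWebGpuApi(nav: NavigatorLike): boolean {
  return nav.gpu !== undefined && nav.gpu !== null;
}

// Stronger check: the API can exist while no usable adapter is available,
// in which case requestAdapter() resolves to null.
async function hasWebGpuAdapter(nav: NavigatorLike): Promise<boolean> {
  if (!hasWebGpuApi(nav)) return false;
  try {
    const adapter = await nav.gpu!.requestAdapter();
    return adapter !== null && adapter !== undefined;
  } catch {
    return false;
  }
}
```

In an extension you would pass the real `navigator` and fall back to something else (a WASM backend, or a native helper) when the check returns false.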
It should be possible using native messaging [1], which lets an extension call out to an external binary. The 1Password extensions use that to communicate with the password manager binary.
[1] https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...
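To give an idea of what the external binary has to speak: native messaging passes JSON over stdin/stdout, with each message UTF-8 encoded and prefixed by a 32-bit length in native byte order. A minimal sketch of that framing, assuming a little-endian machine (true for typical x86 and ARM setups); the function names are mine, not from any library:

```typescript
// Frame a message the way a native messaging host writes it to stdout:
// 4-byte length prefix followed by the UTF-8 JSON body.
function encodeNativeMessage(message: unknown): Uint8Array {
  const body = new TextEncoder().encode(JSON.stringify(message));
  const framed = new Uint8Array(4 + body.length);
  // Assumption: little-endian host, hence the `true` flag.
  new DataView(framed.buffer).setUint32(0, body.length, true);
  framed.set(body, 4);
  return framed;
}

// Parse one framed message, as a host would after reading it from stdin.
function decodeNativeMessage(framed: Uint8Array): unknown {
  const view = new DataView(framed.buffer, framed.byteOffset, framed.byteLength);
  const length = view.getUint32(0, true);
  const body = framed.subarray(4, 4 + length);
  return JSON.parse(new TextDecoder().decode(body));
}
```

On the extension side the browser does this framing for you, so you only deal with plain JSON objects via the runtime messaging calls.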
I did actually make a rough proof-of-concept of this! One of my long-term visions is to have it running natively in-browser, and able to automatically fix site issues caused by adblocking whenever they happen.
The PoC is a bit outdated but it's here: https://github.com/brave/cookiemonster/tree/webext