
simonw • today at 5:41 PM • 1 reply • view on HN

The 3B vision model runs in the browser (after a 3GB model download). There's a very cool demo of that here: https://huggingface.co/spaces/mistralai/Ministral_3B_WebGPU

Pelicans are OK but not earth-shattering: https://simonwillison.net/2025/Dec/2/introducing-mistral-3/

Replies

troyvit • today at 7:38 PM

I'm reading this post and wondering what kind of crazy accessibility tools one could make. I think it's a little off the rails, but imagine a tool that describes a web video for a blind user as it happens: not just the speech, but the actual action.


alt Hacker News
