logoalt Hacker News

simonwyesterday at 6:52 PM0 repliesview on HN

There are some spectacular local models for generating text descriptions of images now. I suggest starting with Mistral Small 3.2, Gemma 3 and Qwen 2.5VL - all available via Ollama.

I expect we will see a Qwen 3VL soon.