I've been super impressed by qwen3:0.6b (yes, 0.6B) running in Ollama.
If you have very specific, constrained tasks it can do quite a lot. It's not perfect though.
https://tools.nicklothian.com/llm_comparator.html?gist=fcae9... is an example conversation where I took OpenAI's "Natural language to SQL" prompt[1], sent it to qwen3:0.6b via Ollama, and then asked Gemini Flash 3 to compare what qwen3:0.6b did vs what Flash did.
Flash was clearly correct, but the qwen3:0.6b errors are interesting in themselves.
[1] https://platform.openai.com/docs/examples/default-sql-transl...
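For reference, the workflow above is easy to reproduce against a local Ollama server. This is a minimal sketch, assuming Ollama's standard `/api/chat` REST endpoint; the system prompt here is a condensed stand-in for OpenAI's full example prompt, not the real thing.

```python
# Sketch: send an NL-to-SQL prompt to a local Ollama model.
# Assumptions: Ollama is running on the default port and the
# qwen3:0.6b model has been pulled (`ollama pull qwen3:0.6b`).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"

# Condensed stand-in for OpenAI's "Natural language to SQL"
# system prompt (the real one includes a fuller table schema).
SYSTEM_PROMPT = (
    "Given the following SQL tables, your job is to write queries "
    "given a user's request.\n"
    "CREATE TABLE Orders (OrderID int, CustomerID int, OrderDate datetime);"
)

def build_payload(model: str, question: str) -> dict:
    """Build a non-streaming Ollama chat request for the NL-to-SQL task."""
    return {
        "model": model,
        "stream": False,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    }

def ask(model: str, question: str) -> str:
    """POST the request to a locally running Ollama server and return the reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, question)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]

if __name__ == "__main__":
    print(ask("qwen3:0.6b", "Show the total number of orders per customer."))
```

Swapping the model name is all it takes to run the same prompt against a different local model for comparison.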
I’ve experimented with several of the really small models. It’s impressive that they can produce anything at all, but in my experience the output is basically useless for anything of value.