The trick with small models is what you ask them to do. I am working on a data extraction app (from emails and files) that works entirely local. I applied for Taalas API because it would be awesome fit.
dwata: Entirely Local Financial Data Extraction from Emails Using Ministral 3 3B with Ollama: https://youtu.be/LVT-jYlvM18