We do, but the vast majority of users interact with centralised models from OpenAI, Google Gemini, Grok...
Because small models are just not that good.
The vast majority won't switch until there's a 10x use case. We know they are coming. Why bother hopping?
I'm not sure we can look forward to self-hosted models ever being mainstream.
Something like 50% of internet users are already interacting with one of these daily.
You usually only change your habits when something is substantially better.
I don't know how free versions are going to be smaller, run on commodity hardware, take up trivial space and RAM, etc., AND be substantially better.