> And I can always see its full thoughts, don't have to worry about where my data is getting sent, and know it can't get secretly nerfed.
For this reason I wonder if local models are a potential business opportunity. Offer engineering teams a pre-built, pre-configured GPU rig they can run in a closet. No need to worry about all the things you mentioned, and clients can rest assured their data isn't disappearing into a sketchy data center. There might be regulatory reasons that make on-prem setups appealing as well.
On-premise (1960-2010) -> Cloud (2010-2026) -> On-premise (2026+)?
I think the next step for anyone other than the bloated US model makers is to follow https://chatjimmy.ai/ with one of the Qwen models. If they can mass-produce something at a reasonable cost, these would be awesome sidecars.
This is, as far as I know, the business model of companies like Mistral and Cohere.