They have a great opportunity in front of them, one that would also benefit consumers, which seems rare in the LLM era.
If people can run opus4.6/gpt5.5-like models locally, labs could raise their prices and sell token speed, better reasoning, mobile-focused improvements, you name it.
Not all consumers are power users and many will be happy to pay for flexibility.
Most people don't actually want to manage models, updates, context limits, quantization, etc. They just want the thing to work everywhere.