The quality of local models has increased significantly since this time last year, as have the options for running larger local models.
These takes are terrible.
1. It costs $100k in hardware to run Kimi 2.5 for a single session at a decent tok/s, and it's still not capable enough for anything serious.
2. I want whatever you're smoking if you think anyone is going to spend billions training models capable of outcompeting their own products, make them affordable to run, and then open source them.
The quality of local models is still abysmal compared to commercial SOTA models. You're not going to run something like Gemini or Claude locally. I have some "serious" hardware with 128G of VRAM and the results are still laughable. If I moved up to 512G, it still wouldn't be enough. You need serious hardware to get both quality and speed. If all I can get is "quality" at a couple of tokens per second, it's not worth bothering.
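The back-of-the-envelope math bears this out. As a rough sketch (the ~1T parameter count is an illustrative assumption for a Kimi-class MoE model, not an official figure), the weights alone at common quantization levels already swamp a 512G box before you account for KV cache and activations:

```python
# Rough VRAM needed for model weights alone -- ignores KV cache,
# activations, and runtime overhead, all of which add more on top.

def weight_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate weight memory in GB at a given quantization level."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

# Assumed ~1-trillion-parameter model at common quantizations:
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit: ~{weight_gb(1000, bits):.0f} GB")
# 16-bit: ~2000 GB
#  8-bit: ~1000 GB
#  4-bit: ~500 GB
```

Even at an aggressive 4-bit quantization (which costs quality), the weights alone sit right at the 512G ceiling, leaving nothing for context.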
They are getting better, but that doesn't mean they're good.