Oh, I agree. Last year I tried making each model a "daily driver", including small ones like gpt5-mini / haiku, and open ones, like glm, minimax and even local ones like devstral. They can all do some tasks reliably, while struggling at other tasks. But yeah, there comes a point where, depending on your workflows, some smaller / cheaper models become good enough.
The problem is the overhypers, who make small / open models sound like they are close to the SotA. They really aren't. It's one thing to say "this small model is good enough to handle some tasks in production code", and a different thing to say "close to opus". One makes sense; the other just sets the wrong expectations, and is obviously false.
I am desperate for tooling that puts me back in charge and just has the models as advisors. In that case the "smart level" is just a dial.
I'm probably going to have to make it myself.
There is no doubt that for many tasks the SotA models from OpenAI and Anthropic are better than the available open-weights models.
Nevertheless, I do not believe that OpenAI, Anthropic or Google know any secret sauce for training LLMs better. I believe that their current superiority is just due to brute force: their LLMs are bigger and have been trained on much more data than the other LLM producers have been able to access.
Moreover, for myself, I can extract much more value from an LLM that is not metered by token cost and where I have full control over the harness used to run the model. Even if the OpenAI or Anthropic models were much better than the competing models, I would still be able to accomplish more useful work with an open-weights model.
I have already gone through this kind of transition once, from fast mainframes and minicomputers that I accessed remotely and shared with other users, to slow personal computers over which I had absolute control. Despite the differences in theoretical performance, I could do much more with a PC, and the same is true when I have absolute control over an LLM.