Haiku not getting an update is becoming telling. I suspect we are reaching a point where the low end models are cannibalizing high end and that isn't going to stop. How will these companies make money in a few years when even the smallest models are amazing?
Isn't it pretty common for the smaller models to release a little while after the bigger ones, for all the big model providers?
It seems to be a rule that older models are more expensive than newer ones. The low end models have higher $CPT and worse output. I wonder if the move is to just have one model and quantize if you hit compute constraints
The Gemma models are at this point. A 31B model that can fit on a consumer card is as good as Sonnet 4.5. I haven't put it through as much on the coding front or tool calling as I have the Claude or GPT models, but for text processing it is on par with the frontier models.
Google is putting a lot of research into small models. Most of my AI budget is now going to small models because I am doing lots of tiny tasks that the small models do great with. I would think a decent chunk of Goog's API revenue probably comes from their small models.