Here is a trend I'm noticing:
- GPT-5 mini costs $0.25/$2 and will be discontinued in December.
- GPT-5.4 mini costs $0.75/$4.5 and is supposed to be the replacement.
- GPT-5.4 nano costs $0.2/$1.25 and, while it ranks better in benchmarks than GPT-5 mini, it's not even close when you test it in real scenarios.
So if you use GPT-5 mini today, you're forced to move to GPT-5.4 mini.
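To put rough numbers on that jump, here is a minimal sketch using the prices quoted above. It assumes those figures are USD per 1M input/output tokens (the usual convention, though the post doesn't state it), and the monthly token volumes are hypothetical.

```python
# Prices as quoted in the post; ASSUMED to be USD per 1M (input, output) tokens.
PRICES = {
    "GPT-5 mini": (0.25, 2.00),
    "GPT-5.4 mini": (0.75, 4.50),
    "GPT-5.4 nano": (0.20, 1.25),
}


def monthly_cost(model, input_mtok, output_mtok):
    """Cost in USD for a workload measured in millions of tokens."""
    price_in, price_out = PRICES[model]
    return price_in * input_mtok + price_out * output_mtok


# Hypothetical workload: 1,000M input tokens and 100M output tokens per month.
INPUT_MTOK, OUTPUT_MTOK = 1000, 100

baseline = monthly_cost("GPT-5 mini", INPUT_MTOK, OUTPUT_MTOK)
for model in PRICES:
    cost = monthly_cost(model, INPUT_MTOK, OUTPUT_MTOK)
    print(f"{model}: ${cost:,.2f}/month ({cost / baseline:.2f}x GPT-5 mini)")
```

Per token, the move from GPT-5 mini to GPT-5.4 mini is 3x on input and 2.25x on output; the blended multiplier for a real workload depends on its input/output mix.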
The same thing is happening here, as their “Luna” model will cost $1/$6.
Can't we just stay with the models we actually want? I don't need GPT-5.4 mini. GPT-5 does the job.
Maybe it’s the realization that it was never that cheap in the first place and they're forcing us to upgrade in a slow and painful way.
It’s the same as the SaaS model. Price keeps going up, and to justify it they keep forcing you to upgrade to new versions with features that nobody asked for.
I've struggled with this. You definitely can have great cheap models. Many of them are open source and served profitably by neo-clouds. The big labs have basically given up on cheap models, and it is frustrating. It means applications are less likely to build on them (we are shifting workloads from Haiku/Sonnet to DeepSeek V4, for example).
I suspect the problem is that they need to charge a lot to keep revenue numbers up, and they are more worried about cannibalizing themselves than others cannibalizing them.
Good observations. There's definitely a trend of rising prices, but it is balanced by innovation and by other models (both open and closed) emerging as alternatives. It's natural for the labs to explore how far they can push pricing, and for competitors to treat that margin as their opportunity to grow their business.
Eventually the pricing should be more stable.
It's happening with Anthropic's Haiku and Gemini Flash/Flash-Lite too. All of them are increasing prices and deprecating cheap models.
Each model release gives them an opportunity to reduce the number of old models still on offer and to charge for a higher, less-subsidized tier. The trick is to charge a subsidized price that is less than the cost of an M3 Ultra, so customers keep paying you rent instead of a one-time fixed cost. So far open models can't compete with Opus 4.5, but as soon as they can, people will be looking at buying devices that can run such a model locally.
We are a Claude shop, but we already bought two Mac Studios to start migrating less complex but still agentic workflows there. We will break even on those in less than a year.
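For anyone wanting to run the same break-even math on their own setup, a minimal sketch follows. The comment gives no figures, so every number below is a hypothetical placeholder, and `breakeven_months` is an illustrative helper, not anything the commenter described.

```python
def breakeven_months(hardware_cost, monthly_api_spend_replaced, monthly_running_cost):
    """Months until a one-time hardware purchase pays for itself.

    Returns None if there are no monthly savings, i.e. it never pays off.
    """
    monthly_savings = monthly_api_spend_replaced - monthly_running_cost
    if monthly_savings <= 0:
        return None
    return hardware_cost / monthly_savings


# All figures are HYPOTHETICAL placeholders, not taken from the thread.
months = breakeven_months(
    hardware_cost=20_000,              # assumed total price of the machines
    monthly_api_spend_replaced=2_500,  # assumed API spend moved off hosted models
    monthly_running_cost=500,          # assumed power, maintenance, ops time
)
print(f"Break-even after {months:.1f} months")
```

The result is only as good as the estimate of API spend actually displaced; workloads that still need a frontier model don't count toward the savings.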
On Nano "it's not even close when you test it in real scenarios" - what have you seen? What kind of things can GPT-5 Mini handle that GPT-5.4 Nano cannot?
> stay with the models we actually want
If you want control over the models you use, you have to self-host.
5.5 is smart enough for 99% of my tasks. I need that level of intelligence at ever-decreasing prices.
I think it's more that they're abandoning simpler AI tasks to Chinese models. Qwen 35B and DeepSeek Flash are better than GPT-5 mini on my tasks and way cheaper.
Hardware hosting old models isn't hosting new models. If you want consistent models, host your own open weights ones.
> Maybe it’s the realization that it was never that cheap in the first place and they're forcing us to upgrade in a slow and painful way.
All the analysis I have seen points to frontier models being profitable to serve. It’s using 50% or more of your GPUs for research plus CapEx for capacity expansion that makes these businesses so heavily cash-negative.
What you are observing is downstream of another detail. It gets more expensive to serve a model as utilization goes down. Plus the opportunity cost vs newer, more-profitable models.
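A rough sketch of the utilization point, assuming a model is served on dedicated hardware with a fixed hourly cost. The hourly cost and throughput figures are hypothetical, and real serving economics (batching, shared capacity, reserved pricing) are more complicated than this.

```python
def cost_per_million_tokens(hourly_cost, peak_tokens_per_second, utilization):
    """Serving cost in USD per 1M tokens for dedicated capacity.

    The hardware costs the same per hour whether busy or idle, so the cost
    per token scales with 1 / utilization.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = peak_tokens_per_second * 3600 * utilization
    return hourly_cost / tokens_per_hour * 1_000_000


# HYPOTHETICAL deployment: $10/hour of hardware, 5,000 tokens/s at full load.
for utilization in (1.0, 0.5, 0.25):
    cost = cost_per_million_tokens(10.0, 5000, utilization)
    print(f"utilization {utilization:.0%}: ${cost:.2f} per 1M tokens")
```

Under these assumptions, a model whose traffic drops to a quarter of its capacity costs four times as much per token to keep online.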
There are plenty of valid reasons to critique here. “OpenAI is lying about this being a sustainable price to serve” is not one of them.
who tf would use mini when you have dsv4 flash
discontinuing the cheaper options is a risky move for openai
will trigger re-evaluations of models by other labs + inference providers
Yeah, this is the classic Silicon Valley strategy of selling at a loss and then inflating prices once they have captured the market.
See Uber, Netflix, etc.
No, you can't. These companies have two infrastructures: model training and model inference.
Inference needs to cache, and it can't cache random model data, so the capacity is essentially dedicated; it can't spin up models on demand, so it has to know what demand is coming.
These companies are going to end up with very few models offered, and that's probably generous. They might end up with just one model, and you pay for removing its safeguards.
If you have no need for Anthropic/OpenAI's frontier model capability, you may be better served with an open-weight model that can't be taken away.
Edit:
> GPT-5 does the job.
I bring up DeepSeek V4 Flash a lot on HN, but I want to mention that according to Artificial Analysis, it trades blows with GPT-5 (high) (from August 2025). [0]
[0]: https://artificialanalysis.ai/models/comparisons/deepseek-v4...