> The companies that are entirely AI-dependent may need to raise prices dramatically as AI prices go up.
It's not that clear. Sure, hardware prices are going up due to the extremely tight supply, but AI models are also improving quickly to the point where a cheap mid-level model today does what the frontier model did a year ago. For the very largest models, I think the latter effect dominates quite easily.
There's only so far engineers can optimise the underlying transformer technique, which is and always has been doing all the heavy lifting in the recent AI boom. It's going to take another genius to move this forward. We might see improvements here and there, but I don't think the magnitude of the data and VRAM requirements will change significantly.
We have been processing the same data for the last 2 years.
Inference prices dropped by something like 90 percent in that time (a combination of cheaper models, implicit caching, service levels, different providers and other optimizations).
Quality went up. Quantity of results went up. Speed went up.
The service level we provide to our clients went up massively and justified better deals. Headcount went down.
What's not to like?
You also have to look at how exposed your vendors are to cost increases.
Your company may have the resources to effectively shift to cheaper models without service degradation, but your AI tooling vendors might not. If you pay for 5 different AI-driven tools, that's 5 different ways your upstream costs may increase that you'll need to pass on to customers as well.
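A back-of-envelope sketch of that exposure, in Python. Every figure and name here (`own_inference`, `tool_a` and so on) is made up for illustration and not taken from anyone's actual bills; the point is only that savings on your own inference can be partly eaten by vendor price increases.

```python
def blended_cost_change(spend, change):
    """Return the overall fractional change in total monthly spend.

    spend:  dict of line item -> current monthly cost
    change: dict of line item -> fractional price change (0.30 means +30%);
            items missing from `change` are assumed to stay flat
    """
    old_total = sum(spend.values())
    new_total = sum(
        cost * (1 + change.get(name, 0.0)) for name, cost in spend.items()
    )
    return (new_total - old_total) / old_total


# Hypothetical inputs: one in-house inference bill plus five vendor tools.
spend = {
    "own_inference": 1000.0,
    "tool_a": 200.0,
    "tool_b": 200.0,
    "tool_c": 200.0,
    "tool_d": 200.0,
    "tool_e": 200.0,
}
# Assumption: you halve your own inference cost, every vendor raises prices 30%.
change = {name: 0.30 for name in spend}
change["own_inference"] = -0.50

print(f"Overall change in spend: {blended_cost_change(spend, change):+.1%}")
```

With these made-up inputs, a halved inference bill and 30% vendor increases net out to roughly a 10% reduction overall, far less than the 50% you achieved on the part you control.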
>> The companies that are entirely AI-dependent may need to raise prices dramatically as AI prices go up.
> It's not that clear. Sure, hardware prices are going up due to the extremely tight supply, but AI models are also improving quickly to the point where a cheap mid-level model today does what the frontier model did a year ago.
I agree; I got some coding value out of Qwen for $10/month (unlimited tokens); a nice harness (and some tight coding practices) narrows the gap between SOTA and six-month-old second-tier models.
If I can get 80% of the way to Anthropic's or OpenAI's SOTA models for $10/month with unlimited tokens, guess what I am going to do...