Inference is profitable. Companies lose money because:
1. Training is expensive: not just compute, but also acquiring the data, researchers' salaries, etc.
2. You have to keep producing new models so that people keep using your inference, and there seems to be no end to this. So companies have to keep pouring in billions to keep the cycle going.
3. Staff salaries and other admin costs are not that high compared to 1 and 2.
So? How does it change the equation?
Nobody is going to charge "inference price" for model usage.
Inference at per-token pricing is profitable.
The article's point is that if you're relying on flat-fee subscriptions, a rude awakening may be coming. That seems plausible to me. Issues around token quotas are a frequent topic on HN.
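To make the flat-fee vs. per-token point concrete, here is a rough back-of-the-envelope sketch. The fee and the serving cost are made-up placeholder numbers, not figures from the article or from any provider, and the function names are just for illustration.

```python
def monthly_margin(flat_fee, tokens_used, cost_per_million_tokens):
    """Provider's margin on one flat-fee subscriber for one month."""
    serving_cost = tokens_used / 1_000_000 * cost_per_million_tokens
    return flat_fee - serving_cost


def break_even_tokens(flat_fee, cost_per_million_tokens):
    """Monthly token usage at which the flat fee exactly covers serving cost."""
    return flat_fee / cost_per_million_tokens * 1_000_000


# Hypothetical numbers, chosen only to show the shape of the problem.
FLAT_FEE = 20.0    # dollars per month (assumption)
COST_PER_M = 2.0   # provider's serving cost per million tokens (assumption)

for tokens in (1_000_000, 10_000_000, 50_000_000):
    margin = monthly_margin(FLAT_FEE, tokens, COST_PER_M)
    print(f"{tokens:>12,} tokens/month -> margin ${margin:.2f}")

print(f"break-even at {break_even_tokens(FLAT_FEE, COST_PER_M):,.0f} tokens/month")
```

Under per-token pricing the provider's revenue grows with usage, so heavy users stay profitable. Under a flat fee, every token past the break-even point is a loss, which would explain why quotas keep coming up.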