Economies of scale are a fact of nature and aren’t going to be subverted in the future by even the most advanced local models.
The gains from economies of scale are lost because you still have a middleman, the hosting provider, who wants to profit too.
Over the long term it's always been better to buy than to rent. Even if the renting option is technically more efficient on the GPUs, when you buy you don't have to pay some hosting provider's profit margin.
Things can get both more expensive and cheaper at scale, hence the term.
For example (and relevant to AI), I can generate electricity on my roof at $0.20-0.25/kWh, batteries included. In California the electric utility can’t offer it cheaper than $0.30-0.50/kWh. Therefore at scale, electricity is actually more expensive.
There are many such examples.
Setting aside that very little about economics rises to the level of "facts of nature" like physics...
What makes you so certain that economies of scale won't work the opposite way from what you imagine? E.g., if model improvement tapers off but RAM costs decline (hard to believe atm, but historically likely), then eventually everyone will be able to run SOTA models on their personal hardware.
Heck, even if model sizes simply grow more slowly than RAM costs decrease, the same would happen.
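To put rough numbers on that, here is a minimal sketch. Every figure in it (model size, RAM price, budget, growth and decline rates) is a made-up assumption for illustration, not a forecast, and the function name is hypothetical. It just compounds the two rates year by year and reports the first year the RAM needed to hold the model fits a fixed budget.

```python
def years_until_affordable(model_gb, price_per_gb, budget,
                           size_growth, price_decline, max_years=100):
    """Return the first year (0 = today) in which the RAM needed to hold
    the model costs no more than `budget`, or None if that never happens
    within `max_years`.

    size_growth and price_decline are yearly fractions (0.10 = 10%).
    All inputs are hypothetical illustration values.
    """
    for year in range(max_years + 1):
        if model_gb * price_per_gb <= budget:
            return year
        model_gb *= 1 + size_growth        # models keep getting bigger
        price_per_gb *= 1 - price_decline  # RAM keeps getting cheaper
    return None


# Hypothetical scenario: a 1000 GB model, RAM at $5/GB, a $1000 budget.
slow_growth = years_until_affordable(1000, 5.0, 1000, 0.10, 0.25)
no_growth = years_until_affordable(1000, 5.0, 1000, 0.00, 0.25)
fast_growth = years_until_affordable(1000, 5.0, 1000, 0.50, 0.20)
print(slow_growth, no_growth, fast_growth)
```

With these invented inputs the crossover arrives whenever the yearly factor (1 + size_growth) * (1 - price_decline) is below 1, and never arrives when it is above 1. The whole disagreement comes down to which side of 1 that product lands on.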
... said the IBM executive to a young Bill Gates.
Which is of course why, if you want to render 3D scenes to play a video game, you have to rent time on a mainframe system. I don’t see that changing ever - it’s just economies of scale!
(sarcasm, btw)