> or in the cloud but way more expensive than it is today.
Why? It's widely understood that the big players are making profit on inference. The only reason they still post losses is that training is so expensive, but you have to pay for training whether the models run in the cloud or on your device.
If you think about it, it's always going to be cheaper and more energy-efficient to have dedicated cloud hardware to run models. Running them on your phone, even if possible, is just going to suck up your battery life.
> It's widely understood that the big players are making profit on inference.
This is most definitely not widely understood. We don't know yet. There are tons of discussions where people disagree on whether it really is profitable. Unless you have proof, don't say "this is widely understood".
The big players are plausibly making profits on raw API calls, not subscriptions. API calls are quite costly compared to third-party inference on open models, but even setting that up is a hassle, and you as an end user aren't getting any subsidy. Running inference locally will make a lot of sense for most light and casual users once the subsidies for subscription access cease.
Also, while datacenter-based scale-out of a model over multiple GPUs running large batches is more energy-efficient, it ultimately creates a single point of failure you may wish to avoid.
> It's widely understood that the big players are making profit on inference.
Are they? Or are they just saying that to make their offerings more attractive to investors?
Plus, I think most people using agents for coding are on subscriptions, which are definitely not profitable for the providers.
Locally running models that are snappy and mostly as capable as current SOTA models would be a dream. No internet connection required, no payment plans, no relying on a third-party provider to do your job. No privacy concerns. Etc etc.
Laptop/desktop could work. Most systems are on a charger most of the time anyway.
> It's widely understood that the big players are making profit on inference.
I love the whole “they are making money if you ignore training costs” bit. It is always great to see somebody say something like “if you look at the amount of money that they’re spending it looks bad, but if you look away it looks pretty good”, like it’s the money version of a solar eclipse.
> It's widely understood that the big players are making profit on inference.
If you add in the cost of training, it’s not profitable.
Not including the cost of training is a bit like saying the only cost of a cup of coffee is the paper cup it’s in. The only way OpenAI gets to charge for inference is by selling a product people can’t get elsewhere for much cheaper, which means billions in R&D costs. But because of competition, each model effectively has a “shelf life”.
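To put rough arithmetic on the shelf-life point, here is a minimal sketch. Every number and function name in it is made up for illustration; none of these are real figures from any provider. It only checks whether the inference margin earned during a model's shelf life covers the one-time training bill.

```python
def lifetime_profit(training_cost, monthly_revenue, monthly_serving_cost, shelf_life_months):
    """Profit over the model's shelf life once training is counted (hypothetical units)."""
    monthly_margin = monthly_revenue - monthly_serving_cost
    return monthly_margin * shelf_life_months - training_cost


def breakeven_months(training_cost, monthly_revenue, monthly_serving_cost):
    """Months of inference needed to pay back training, or None if the margin is not positive."""
    monthly_margin = monthly_revenue - monthly_serving_cost
    if monthly_margin <= 0:
        return None
    return training_cost / monthly_margin


# Hypothetical example: inference is "profitable" (150 in, 100 out per month),
# but training cost 1000 and the model is only competitive for 12 months.
print(lifetime_profit(1000, 150, 100, 12))  # -400
print(breakeven_months(1000, 150, 100))     # 20.0
```

In this toy case inference runs at a healthy margin every month, yet the model would need 20 months to pay back training and only gets 12 before it's obsolete. Whether real providers look like this depends entirely on numbers we don't have.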