> Even at non-VC-subsidized $/token prices, it's still much cheaper to run cloud-based models.
On a price-per-watt basis, this is not true; people have done the math on /r/LocalLLaMA many times over [1]. Local models, while not as good as premier models (GPT 5.5, etc.), are roughly 80%+ of the way there and often converge on a similar solution after a few dead ends.
[1] https://www.reddit.com/r/LocalLLM/comments/1kshq4f/electrici...
Maybe not per watt, but unless you already happen to own the 3900 cited in that post, you'd have to buy one as well, and those currently sell for around $1400 used.
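
For anyone who wants to plug in their own numbers, here is a rough sketch of the break-even math. Only the $1400 hardware price comes from this thread; the wattage, tokens/sec, electricity rate, and cloud price below are made-up placeholders, and the function names are hypothetical. It also ignores idle power, resale value, and quality differences.

```python
def local_cost_per_million_tokens(power_watts, tokens_per_second, electricity_usd_per_kwh):
    """Electricity cost in USD to generate one million tokens locally."""
    seconds = 1_000_000 / tokens_per_second
    kwh = power_watts * seconds / 3_600_000  # watt-seconds -> kWh
    return kwh * electricity_usd_per_kwh


def break_even_million_tokens(hardware_usd, cloud_usd_per_million, local_usd_per_million):
    """Millions of tokens needed before the hardware purchase pays for itself.

    Returns None when local electricity alone costs at least as much as cloud,
    since the hardware never pays for itself in that case.
    """
    saving = cloud_usd_per_million - local_usd_per_million
    if saving <= 0:
        return None
    return hardware_usd / saving


# Placeholder inputs (assumptions, not measurements): swap in your own.
local = local_cost_per_million_tokens(
    power_watts=350,               # assumed draw under load
    tokens_per_second=50,          # assumed generation speed
    electricity_usd_per_kwh=0.15,  # assumed electricity rate
)
needed = break_even_million_tokens(
    hardware_usd=1400,             # used price quoted in this thread
    cloud_usd_per_million=1.00,    # assumed cloud price
    local_usd_per_million=local,
)
print(f"local: ${local:.2f} per 1M tokens, break-even after {needed:.0f}M tokens")
```

The point is just that the per-watt comparison and the total-cost comparison can disagree: the lower your token volume, the longer the upfront purchase takes to amortize.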