Serving a barely useful GLM 5.2 costs what? $15k? Actually useful is like $50k? You’ll never recoup the cost, unless ‘locally’ means ‘inference provider is not the model provider’?
Yes, they mean open-weight models offered by various providers.
$15k or $50k is pretty cheap, all things considered. A year ago it would have been more expensive, and one person can spend that in a month or two.
I bought my Spark and the models have already improved since then (qwen3.6, speculative decoding for 2x tgen, diffusion gemma for 4x tgen), and I expect that to continue. Look out another 2-3 years: local is going to be very competitive.
You can recoup the costs quicker if you resell access to your local LLM on a reselling service.
Not "local" in the literal sense, but I set it up to serve at half quant for $23/hr and full quant for $35/hr.
You don't need to have it always on? This is a far cry from "$200/month," but I do not think it's $50k for "useful." Do you see it differently?
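For a rough sense of scale, here is a minimal sketch that converts the budgets mentioned in this thread into hours of on-demand serving at the two quoted hourly rates. It assumes the $23/hr and $35/hr figures are what you pay per hour while the model is up, and it ignores storage, startup time, and idle cost. The names `hours_covered` and `budget_table` are made up for illustration.

```python
# Hourly rates and budgets are the figures quoted in the thread.
# Assumption: the hourly rate is the full cost of serving while the model is up.
RATES_USD_PER_HOUR = {"half quant": 23.0, "full quant": 35.0}
BUDGETS_USD = {"$200/month": 200.0, "$15k": 15_000.0, "$50k": 50_000.0}


def hours_covered(budget_usd: float, hourly_rate_usd: float) -> float:
    """Return how many serving hours a budget buys at a given hourly rate."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly rate must be positive")
    return budget_usd / hourly_rate_usd


def budget_table() -> dict[tuple[str, str], float]:
    """Map (budget label, quant label) to serving hours, rounded to 0.1 h."""
    return {
        (budget_label, quant_label): round(hours_covered(budget, rate), 1)
        for budget_label, budget in BUDGETS_USD.items()
        for quant_label, rate in RATES_USD_PER_HOUR.items()
    }


for (budget_label, quant_label), hours in budget_table().items():
    print(f"{budget_label:>10} at {quant_label}: {hours} hours")
```

Under those assumptions, $200 buys under nine hours a month at half quant, while $15k of hardware money buys about 650 hours of on-demand serving, so which option wins depends mostly on how many hours a month you actually need the model running.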