btbuildem • yesterday at 2:46 PM • 8 replies

> doesn't make financial sense to self-host

I guess that's debatable. I regularly run out of quota on my Claude Max subscription. When that happens, I can sort of kind of get by with my modest setup (2x RTX 3090) and quantized Qwen3.

And this does not even account for privacy and availability. I'm in Canada, and as the US is slowly consumed by its spiral of self-destruction, I fully expect at some point a digital iron curtain will go up. I think it's prudent to have alternatives, especially with these paradigm-shattering tools.

Replies

jsheard • yesterday at 2:58 PM

I think AI may be the only place you could get away with calling a 2x350W GPU rig "modest".

That's like ten normal computers' worth of power for the GPUs alone.

wongarsu • yesterday at 3:19 PM

Self-hosting training (or gaming) makes a lot of sense, and once you have the hardware, self-hosting inference on it is an easy step.

But if you have to factor in hardware costs, self-hosting doesn't seem attractive. Every model I can self-host I can also find on OpenRouter and instantly get a provider offering great prices. With most of the cost being in the GPUs themselves, it just makes more sense to have others do it with better batching and GPU utilization.

Aurornis • yesterday at 4:06 PM

> I regularly run out of quota on my Claude Max subscription. When that happens, I can sort of kind of get by with my modest setup (2x RTX 3090) and quantized Qwen3.

When talking about a fallback from Claude plans, the correct financial comparison would be the same model hosted on OpenRouter.

You could buy a lot of tokens for the price of a pair of 3090s and a machine to run them.

mythz • yesterday at 3:02 PM

Did the napkin math on M3 Ultra ROI when DeepSeek V3 launched: at $0.70/2M tokens and 30 tps, a $10K M3 Ultra would take ~30 years of non-stop inference to break even - without even factoring in electricity. Clearly people aren't self-hosting to save money.

I've got a Lite GLM sub at $72/yr, which would take ~138 years to add up to the $10K M3 Ultra sticker price. Even GLM's highest-cost Max tier (20x Lite) at $720/yr would buy you ~14 years.
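
For anyone who wants to check the napkin math, here is a minimal sketch that uses only the figures quoted in this comment ($10K hardware, $0.70 per 2M tokens, 30 tokens/s, $72/yr and $720/yr subscriptions). The function names are made up for illustration, and like the comment it ignores electricity, resale value, and any future price changes.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def breakeven_years_api(hardware_usd, usd_per_batch, tokens_per_batch, tokens_per_sec):
    """Years of non-stop local inference needed to generate as many tokens
    as hardware_usd would buy from a hosted API at the quoted price."""
    tokens_to_break_even = hardware_usd / usd_per_batch * tokens_per_batch
    return tokens_to_break_even / tokens_per_sec / SECONDS_PER_YEAR


def breakeven_years_subscription(hardware_usd, usd_per_year):
    """Years of a flat-rate subscription that hardware_usd would pay for."""
    return hardware_usd / usd_per_year


# Figures quoted in the comment above
api_years = breakeven_years_api(10_000, 0.70, 2_000_000, 30)
lite_years = breakeven_years_subscription(10_000, 72)
max_years = breakeven_years_subscription(10_000, 720)

print(f"API break-even: {api_years:.1f} years")   # about 30 years
print(f"Lite sub:       {lite_years:.1f} years")  # about 139 years
print(f"Max sub:        {max_years:.1f} years")   # about 14 years
```

The break-even point scales linearly with hardware price and inversely with throughput, so doubling tokens/s only halves the ~30 years.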

visarga • yesterday at 3:45 PM

Your $5,000 PC with 2 GPUs could have bought you 2 years of Claude Max, a much more powerful model with longer context. In 2 years you could make that investment back in a pay raise.

7thpower • yesterday at 3:31 PM

Unless you already had those cards, it probably still doesn’t make sense from a purely financial perspective unless you have other things you’re discounting for.

Doesn’t mean you shouldn’t do it though.

flaviolivolsi • yesterday at 3:02 PM

How does your quantized Qwen3 compare in code quality to Opus?
