nok22kon • today at 7:51 AM • 3 replies

If you do the electricity math, you'll see that you pay more for local models while getting less (local models are more heavily quantized) compared with OpenRouter.

I'm not comparing local Gemma/Qwen with cloud Opus, but local Gemma/Qwen with the same Gemma/Qwen on OpenRouter.

There are reasons to run local (privacy, availability), but cost is not one of them.
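
For anyone who wants to actually run the electricity math, here is a minimal sketch in Python. Every number in it (power draw, tokens per second, electricity price) is a placeholder assumption, not a measurement or a quoted OpenRouter rate, and the function name is made up for illustration. It only counts electricity, not hardware cost.

```python
def local_cost_per_million_tokens(power_watts, tokens_per_second, price_per_kwh):
    """Electricity cost of generating one million tokens on local hardware."""
    seconds = 1_000_000 / tokens_per_second          # time to generate 1M tokens
    kwh = (power_watts / 1000) * (seconds / 3600)    # energy used in that time
    return kwh * price_per_kwh


# Hypothetical inputs: a 400 W box, 20 tokens/s, 0.30 per kWh.
local = local_cost_per_million_tokens(
    power_watts=400, tokens_per_second=20, price_per_kwh=0.30
)
print(f"local electricity per 1M tokens: {local:.2f}")  # 1.67 with these inputs
```

Compare the result with the per-million-token price listed for the same model on the hosted side. The outcome swings heavily on tokens per second, so measure that on your own hardware rather than guessing.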

Replies

calgoo • today at 12:08 PM

I am allowed to plug 800 W of solar panels into a wall socket here in Spain. That would more than cover my current computer with 16 GB of VRAM. Now, if I went and built an LLM server, at full load I would probably be closer to 3600 W (dual EPYC CPUs, which give you 8 x16 PCI channels and up to 8 cards; way overkill, I know). If I halve that to 1 EPYC and 4 x16 PCI channels, and add the same AMD 7800 XT I currently have, then I should in theory be able to run at around 1800 W under full load. That could still be covered by a 2000 W solar install (get a professional setup, OR get a battery unit like an EcoFlow that can output 2000 W and take in about the same amount of solar).

Now, this all brings the upfront costs way up. The solar panels are cheap; it's all the rest around them that tends to cost money.
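
The power budget above works out as follows, as a quick Python sketch. The wattages are the rough estimates from this comment, and the helper name is made up for illustration. It treats the rated solar wattage as if it were always available, which is an optimistic simplification, since rated output is a peak figure.

```python
def headroom_watts(solar_watts, load_watts):
    """Spare solar capacity at full load; negative means the load is not covered."""
    return solar_watts - load_watts


full_build = 3600              # dual EPYC, up to 8 cards (estimate from the comment)
half_build = full_build // 2   # single EPYC, 4 x16 channels -> 1800

print(headroom_watts(2000, half_build))  # 200: the 2000 W install covers it
print(headroom_watts(800, half_build))   # -1000: the 800 W plug-in kit does not
```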

manarth • today at 8:17 AM

That's assuming consumption pricing remains as it is.

There has been a lot of market subsidy in AI, and it is starting to fade away: see, for example, the Copilot quotas/pricing. When VCs switch from investing to wanting a return, the price equation is likely to change.
