Hacker News

pants2 today at 5:00 AM, 8 replies

Cool idea. Just some back-of-the-envelope math here (not trusting what's on their site):

My M5 Pro can generate 130 tok/s (4 streams) on Gemma 4 26B. Darkbloom's pricing is $0.20 per Mtok output.

That's about $2.25/day, or roughly $67/mo in revenue, if it's fully utilized 24/7.

Now assuming a 50W sustained load, that's about 36 kWh/mo, which at ~$0.25/kWh comes to approx. $9/mo in costs.

Could be good for lunch money every once in a while! Around $700/yr after electricity.
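For anyone who wants to plug in their own numbers, the estimate above can be reproduced with a short sketch. The throughput, price, wattage and electricity rate are the figures from the comment; the 30-day month and the function name `monthly_estimate` are assumptions made here for illustration.

```python
# Back-of-the-envelope estimate using the figures quoted in the comment.
TOK_PER_S = 130         # claimed M5 Pro throughput (4 streams), from the comment
PRICE_PER_MTOK = 0.20   # USD per million output tokens, from the comment
POWER_W = 50            # sustained draw assumed in the comment
PRICE_PER_KWH = 0.25    # USD, from the comment
DAYS_PER_MONTH = 30     # assumption: a 30-day month


def monthly_estimate(tok_per_s, price_per_mtok, power_w, price_per_kwh,
                     days=DAYS_PER_MONTH):
    """Return (revenue_usd, energy_kwh, energy_cost_usd) for one month at 24/7 load."""
    hours = days * 24
    tokens = tok_per_s * hours * 3600
    revenue = tokens / 1e6 * price_per_mtok
    energy_kwh = power_w / 1000 * hours
    cost = energy_kwh * price_per_kwh
    return revenue, energy_kwh, cost


revenue, energy_kwh, cost = monthly_estimate(
    TOK_PER_S, PRICE_PER_MTOK, POWER_W, PRICE_PER_KWH
)
net_per_year = (revenue - cost) * 12

print(f"revenue: ${revenue:.2f}/mo")
print(f"energy:  {energy_kwh:.0f} kWh/mo, ${cost:.2f}/mo")
print(f"net:     ${net_per_year:.0f}/yr")
```

Doubling `POWER_W` or raising `PRICE_PER_KWH` shows how quickly the margin shrinks under less favorable assumptions.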


Replies

mavamaarten today at 5:24 AM

Well. Running your machine to do inference will draw more than 50W sustained; I'd say more than double that. Plus electricity is more expensive here (but granted, I do have solar panels). Plus don't forget to factor in that your hardware will age faster.

I'd say it's not worth it. But the idea is cool.

kennywinker today at 5:35 AM

Their example big-earner models are FLUX.2 Klein 4B and FLUX.2 Klein 9B, which I imagine could generate a lot more tokens/s than a 26B model on your machine.

For Gemma 4 26B their math is:

single_tok/s = (307 GB/s / 4 GB) * 0.60 = 46.0 tok/s

batched_tok/s = 46.0 * 10 * 0.9 = 414.4 tok/s

tok/hr = 414.4 * 3600 = 1,492,020

revenue/hr = (1,492,020 / 1M) * $0.200000 = $0.2984

I have no idea if that is a good estimate of how much an M5 Pro can generate - but that’s what it says on their site.

They do a bit of a sneaky thing with the power calculation: they subtract 12 W of idle power, because they assume your machine is idling 24/7 anyway, so the only cost counted is the extra 18 W they estimate you’ll use doing inference. Idk about you, but I do turn my machine off when I’m not using it.
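The formulas quoted above, and the effect of subtracting idle power, can be checked with a small sketch. The parameter names (`efficiency`, `batch`, `batch_eff`) are guesses at what the 0.60, 10 and 0.9 factors stand for, and the $0.25/kWh rate is borrowed from the parent comment rather than from the site.

```python
def site_estimate(bandwidth_gb_s=307, model_gb=4, efficiency=0.60,
                  batch=10, batch_eff=0.9, price_per_mtok=0.20):
    """Reproduce the quoted chain: bandwidth-bound tok/s -> batched tok/s -> $/hr."""
    single = bandwidth_gb_s / model_gb * efficiency
    batched = single * batch * batch_eff
    tok_per_hr = batched * 3600
    revenue_per_hr = tok_per_hr / 1e6 * price_per_mtok
    return single, batched, tok_per_hr, revenue_per_hr


def power_cost_per_hr(watts, price_per_kwh=0.25):
    """Electricity cost in USD for one hour at the given draw (rate is an assumption)."""
    return watts / 1000 * price_per_kwh


single, batched, tok_per_hr, revenue_per_hr = site_estimate()

IDLE_W = 12    # idle power the site subtracts, per the comment
EXTRA_W = 18   # extra draw the site attributes to inference, per the comment

marginal_cost = power_cost_per_hr(EXTRA_W)        # what the site counts
full_cost = power_cost_per_hr(IDLE_W + EXTRA_W)   # if the machine would otherwise be off

print(f"single:  {single:.2f} tok/s")
print(f"batched: {batched:.2f} tok/s")
print(f"tok/hr:  {tok_per_hr:,.0f}")
print(f"revenue: ${revenue_per_hr:.4f}/hr")
print(f"power:   ${marginal_cost:.4f}/hr marginal vs ${full_cost:.4f}/hr full")
```

The unrounded single-stream figure is 46.05 tok/s, which seems to be why the quoted batched value is 414.4 rather than 414.0.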

todotask2 today at 5:05 AM

OpenAI has only about 5% paying customers, so how does it generate revenue?

I don’t think this is a sustainable business model. For example, Cubbit tried to build decentralised storage, but I backed out because better alternatives now exist, and hardware continues to improve and become cheaper over time.

Your electricity and hardware ownership are going to yield a lower return, and this does not actually reduce CO2.

nnx today at 6:10 AM

> My M5 Pro can generate 130 tok/s (4 streams) on Gemma 4 26B.

This seems high. At which quantization? Using LM Studio or something else?

Note: Darkbloom seems to run everything on Q8 MLX.

chaoz_ today at 5:05 AM

Genuinely curious, is there any way to estimate the amortization of a Mac?

I’d imagine 1 year of heavy usage would somehow affect its quality.

xendo today at 5:06 AM

Any idea what accounts for such a difference between your numbers and theirs? Batching? Or could they be doing some crazy prefix caching across all nodes to reduce the actual processing?

znnajdla today at 5:37 AM

Maybe lunch money for you, but there are people in some parts of the world who live on $200/month. Like Ukraine.

MrDrMcCoy today at 5:04 AM

Don't forget to factor in cooling costs.
