Hacker News

jacobgold, yesterday at 7:01 PM (4 replies)

> "~$40k At this price level, you get the next step up in model intelligence. Something pretty close to Claude Opus."

That is equivalent to roughly 16.7 years (200 months) of Claude Opus 4.8 or Codex GPT 5.5 at $200/mo.

I'm a huge fan of running local models, but they're still wildly expensive, lower quality, and possibly dangerous (if backdoored). I sincerely wish this wasn't the case.


Replies

simonw, yesterday at 7:03 PM

That $200/month is already more like $4,000/month if you have to pay full API pricing - "enterprise" companies for example. That drops the equivalent to 10 months.

(I'd be surprised if that local rig really can drive the equivalent of $4,000/month of API spend though, given that a local rig can run prompts in parallel a lot less effectively than Anthropic's many data centers.)
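For anyone who wants to check the two break-even figures in this thread, here is a minimal sketch of the arithmetic. It assumes the ~$40k rig price and the $200/month and $4,000/month spend levels quoted above, and deliberately ignores electricity, depreciation, and throughput differences; `breakeven_months` is just an illustrative helper name.

```python
def breakeven_months(hardware_cost: float, monthly_spend: float) -> float:
    """Months of hosted spend that add up to the one-off hardware price."""
    if monthly_spend <= 0:
        raise ValueError("monthly_spend must be positive")
    return hardware_cost / monthly_spend


RIG_COST = 40_000  # the "~$40k" figure quoted in the thread (assumption: exact)

scenarios = [
    ("$200/mo subscription", 200),
    ("~$4,000/mo at full API pricing", 4_000),
]

for label, monthly in scenarios:
    months = breakeven_months(RIG_COST, monthly)
    print(f"{label}: {months:.0f} months ({months / 12:.1f} years)")
```

This gives 200 months (about 16.7 years) for the subscription case and 10 months for the API-pricing case. The result is only as good as the assumption that the rig can actually absorb that much usage.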

neverm0r3, yesterday at 9:26 PM

I agree with your point, but it should be noted that this assumes LLM prices stay constant. The OpenAIs and Anthropics of this world are still selling their plans at subsidised prices, backed by VC money, and those VCs are going to want their return at some point.

verdverm, yesterday at 7:05 PM

You can use a lot more tokens on hardware than you can spend on a $200/mo plan.

I went through 1B tokens my first month with an OEM Spark. That's more than $1k of Opus. Not a fair comparison, because token patterns are different, but since that time I have also seen a 2-3x improvement in the speeds from improvements in vLLM (mainly MTP). DiffusionGemma is around 4x regular Gemma.

echelon, yesterday at 7:04 PM

Stop trying to run them locally, folks.

You don't own your fiber connection. So why try to own another rapidly depreciating, expensive, and annoying asset?

Rent cloud GPUs!

You get to participate in the ownership, data control, price control, and hacking culture without having to Frankenstein some hobbyist box that costs a ton, is distilled down to functional uselessness, and is a PITA to maintain.
