Hacker News

mitjam 01/15/2026

As a heavy user of OpenAI, Anthropic, and Google AI APIs, I’m increasingly tempted to buy a Mac Studio (M3 Ultra or M4 Pro) as a contingency in case the economics of hosted inference change significantly.


Replies

utopiah 01/15/2026

Don't buy anything physical yet. Benchmark the models you could run on your prospective hardware on a (neo)cloud provider like HuggingFace, and only buy if you believe the quality is up to your expectations. The test itself should cost you about $100 and take a few hours at most.
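A minimal sketch of that kind of test, assuming you've rented an instance and started an OpenAI-compatible server (e.g. vLLM) on it; the base URL, key, and model name below are placeholders, not real endpoints:

```python
import json
import time
import urllib.request


def tokens_per_second(completion_tokens: int, elapsed_s: float) -> float:
    """Generation throughput: the number to compare across rented GPUs."""
    return completion_tokens / elapsed_s if elapsed_s > 0 else 0.0


def bench_once(base_url: str, api_key: str, model: str, prompt: str) -> float:
    """Time one chat completion against an OpenAI-compatible server
    running on the rented hardware, and return tokens/sec."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 512,
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    start = time.monotonic()
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    elapsed = time.monotonic() - start
    return tokens_per_second(data["usage"]["completion_tokens"], elapsed)
```

Run it with prompts representative of your real workload; a handful of short completions won't predict sustained quality or throughput.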

mohsen1 01/15/2026

The thing is, GLM 4.7 easily does the work Opus was doing for me, but to run it fully you'd need much bigger hardware than a Mac Studio. $10k buys you a lot of API calls from z.ai or Anthropic. It's just not economically viable to run a good model at home.
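A rough back-of-envelope version of that economics argument; every number below is an illustrative assumption (token prices and daily usage vary widely), not a quoted price:

```python
# Back-of-envelope: how long until a local rig pays for itself versus
# hosted API calls? All numbers are illustrative assumptions.

def breakeven_days(hardware_cost_usd: float,
                   tokens_per_day: float,
                   api_price_per_mtok: float,
                   power_cost_per_day: float = 1.0) -> float:
    """Days of usage until hardware cost equals cumulative API spend,
    net of assumed local electricity cost. Returns inf if the API is
    cheaper than just keeping the box powered on."""
    api_cost_per_day = tokens_per_day / 1e6 * api_price_per_mtok
    daily_saving = api_cost_per_day - power_cost_per_day
    if daily_saving <= 0:
        return float("inf")
    return hardware_cost_usd / daily_saving


if __name__ == "__main__":
    # Hypothetical heavy user: 5M tokens/day at an assumed blended
    # $3 per million tokens, against a $10k rig.
    days = breakeven_days(10_000, 5e6, 3.0)
    print(f"Break-even after ~{days:.0f} days ({days / 365:.1f} years)")
```

Under those assumed numbers the rig takes roughly two years to pay off, and a light user never breaks even at all, which is the commenter's point.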

pram 01/15/2026

FWIW the M5 appears to be an actual large leap for LLM inference with the new GPU and Neural Accelerator. So I'd wait for the Pro/Max before jumping on an M3 Ultra.

boredatoms 01/15/2026

If there's a market crash, there could be a load of cheap H100s hitting eBay.

mifreewil 01/15/2026

You'd want to get something like an RTX Pro 6000 (~$8,500-$10,000) or at least an RTX 5090 (~$3,000); that's the easiest route, or else a cluster of lower-end GPUs. Or a DGX Spark (~$3,000; there are some better options from manufacturers other than Nvidia).

storus 01/15/2026

An M3 Ultra paired with a DGX Spark is, right now, what an M5 Ultra will be at some unknown future date. You can just buy those two, connect them together using Exo, and have M5 Ultra-class performance/memory right away. And who knows what an M5 Ultra will cost given the RAM/SSD price explosion?

PlatoIsADisease 01/15/2026

There is a reason no one uses Apple for local models. Be careful not to fall for marketing and fanboyism.

Just look at what people are actually using. Don't rely on a few people who tested a few short prompts with short completions.
