M3 max tflops is tiny compared to the 12k box. It's not even comparable.

Thaxll • today at 1:03 AM • 2 replies • view on HN

Replies

It is very comparable if you work out the $/tok/s on inference. I did some napkin math and it looks like you’re getting roughly 3x the performance for 3x the cost. Red v2 vs Mac Studio M3 Ultra 96GB.

If you compare tokens/kWh efficiency then my math has Mac Studio being about 1.5x more efficient.

zozbot234 • today at 1:08 AM

M3 has tolerable decode performance for the price, and that's what people would care about most of the time. they underperform severely wrt. prefill, but that's a fraction of the workload. AI, even agentic AI, spends most of its time outputing tokens, not processing context in bulk.

alt Hacker News

Replies