Hacker News

Thaxll today at 1:03 AM

M3 Max TFLOPS are tiny compared to the $12k box. It's not even comparable.


Replies

davej today at 6:03 AM

It is very comparable if you work out the $/tok/s on inference. I did some napkin math and it looks like you’re getting roughly 3x the performance for 3x the cost. Red v2 vs Mac Studio M3 Ultra 96GB.

If you compare tokens/kWh efficiency, then my math has the Mac Studio being about 1.5x more efficient.
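For anyone who wants to redo the napkin math, here is a rough sketch of the two metrics. It uses normalized placeholder values (Mac = 1.0 for cost, throughput, and power) rather than measured numbers; the only inputs taken from the comment are the "3x the performance for 3x the cost" and "1.5x more efficient" ratios. The function and variable names are made up for illustration.

```python
def dollars_per_tok_per_s(cost_usd: float, tok_per_s: float) -> float:
    """Purchase cost divided by sustained inference throughput."""
    return cost_usd / tok_per_s


def tokens_per_kwh(tok_per_s: float, watts: float) -> float:
    """Tokens generated per hour divided by kWh consumed per hour."""
    return tok_per_s * 3600.0 / (watts / 1000.0)


# ASSUMPTION: normalized placeholders, not benchmarks. Mac Studio = 1.0 everywhere.
mac_cost, mac_tps, mac_watts = 1.0, 1.0, 1.0

# "Roughly 3x the performance for 3x the cost"
box_cost, box_tps = 3.0 * mac_cost, 3.0 * mac_tps

# Power draw the box would need for the Mac to come out 1.5x ahead on tokens/kWh
box_watts = 1.5 * (box_tps / mac_tps) * mac_watts

cost_ratio = dollars_per_tok_per_s(box_cost, box_tps) / dollars_per_tok_per_s(mac_cost, mac_tps)
eff_ratio = tokens_per_kwh(mac_tps, mac_watts) / tokens_per_kwh(box_tps, box_watts)

print(f"$/tok/s ratio (box / Mac): {cost_ratio:.2f}")
print(f"tokens/kWh ratio (Mac / box): {eff_ratio:.2f}")
print(f"implied box power draw relative to Mac: {box_watts / mac_watts:.1f}x")
```

Under those ratios the $/tok/s comes out even, and a 1.5x efficiency gap at 3x the throughput implies the box draws about 4.5x the power. Swap in real prices, tok/s, and wall power to check it against your own setup.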

zozbot234 today at 1:08 AM

M3 has tolerable decode performance for the price, and that's what people would care about most of the time. They underperform severely on prefill, but that's a fraction of the workload. AI, even agentic AI, spends most of its time outputting tokens, not processing context in bulk.
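A quick way to sanity-check the prefill vs. decode split for a given workload is a simple two-term time model: prompt tokens over prefill speed, plus output tokens over decode speed. The sketch below assumes that model and uses made-up request sizes and speeds purely for illustration; none of the numbers are measurements of any machine.

```python
def decode_time_fraction(
    prompt_tokens: int,
    output_tokens: int,
    prefill_tok_per_s: float,
    decode_tok_per_s: float,
) -> float:
    """Share of wall-clock time spent generating tokens, under a
    simple model: time = prompt/prefill_speed + output/decode_speed."""
    prefill_s = prompt_tokens / prefill_tok_per_s
    decode_s = output_tokens / decode_tok_per_s
    return decode_s / (prefill_s + decode_s)


# ASSUMPTION: hypothetical request and speeds, chosen only to show the arithmetic.
frac = decode_time_fraction(
    prompt_tokens=4000,
    output_tokens=500,
    prefill_tok_per_s=400.0,
    decode_tok_per_s=20.0,
)
print(f"decode share of total time: {frac:.0%}")
```

With these placeholder values prefill takes 10 s and decode takes 25 s, so decode is the larger share. The balance shifts toward prefill as prompts get longer relative to outputs, so it is worth plugging in your own workload.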