logoalt Hacker News

jmyeettoday at 6:56 PM0 repliesview on HN

Apple offers relatively affordable options for a high-memory workstation that uses unified memory. They previously offered 256/512GB Mac Studios (both discontinued). Because of this they can keep larger models in memory.

BUT you just can't compete with NVidia performance for LLM workloads (mostly inference) for two reasons:

1. The memory bandwidth just can't compete with a 5090 (1800GB/s). The best current Mac is ~900GB/s. That directly caps tokens/sec and might be manageable but there's another problem; and

2. The raw FLOPS just can't compete with even a 5090. It probably needs to natively support FP4/FP8 to at least maintain a number format parity with NVidia. But beside that, NVidia just has more raw FLOPS.

According to Google, an M5 Max does ~70 FP16 TFLOPS while a 5090 does 380. If Apple can close that gap to at least be competitive and also hold larger models in shared VRAM, that would be a competitive advantage and it would directly attack NVidia's market segmentation.

The Mac Studio last came out March last year. So we may get an update in Q3. Many are pinning their hopes on this. But it might not happen until next year. When it was released the M4 was the state of the art and it came with either the M4 Max or M3 Ultra (which, as I understand it, is basically 2 M3s stuck together, kind of). What people are hoping for is an M5 Ultra with >1000GB/s of memory bandwidth, ideally 200+ FP16 TFLOPS and hopefully FP4/FP4 support.

You can chain Mac Studios together into a cluster with TB5 too.

But it's reasonably likely that the next Mac Studio will be only incrementally better than the last generation.