You're most likely bottlenecked by memory bandwidth for a LLM. The AMD AI MAX 395+ gives you ...

jychang • last Sunday at 10:27 AM • 2 replies • view on HN

You're most likely bottlenecked by memory bandwidth for a LLM.

The AMD AI MAX 395+ gives you 256GB/sec. The M4 gives you 120GB/s, and the M4 Pro gives you 273GB/s. The M4 Max: 410GB/s (14‑core CPU/32‑core GPU) or 546GB/s (16‑core CPU/40‑core GPU).

Replies

zargon • last Sunday at 5:43 PM

It’s both. If you’re using any real amount of context, you need compute too.

cdavid • last Sunday at 1:24 PM

Yeah, memory bandwidth is often the limitation for floating point operations.

alt Hacker News

Replies