Schiendelman • today at 12:52 AM • 2 replies

I'm not surprised to see competition with Blackwell. Rubin is 5x faster than Blackwell at inference - Blackwell is the last generation Nvidia didn't optimize specifically for inference.

If I'm missing something, please let me know!

Replies

boroboro4 • today at 3:35 AM

It's unclear what in Rubin is specifically optimized for inference. I can see the disaggregated part (separate prefill and decoding nodes), but what else?


nullc • today at 1:15 AM

How do you get 5x faster at inference when inference is memory-bandwidth limited? Getting 5x the memory bandwidth of an H100 seems physically difficult.
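
To make the bandwidth argument concrete, here is a rough back-of-the-envelope sketch. It assumes decoding is purely bandwidth-bound, meaning every decode step streams all model weights from memory once, and it ignores KV-cache reads and compute entirely. The function name, the 70B-parameter model, and the 3.35 TB/s figure are illustrative assumptions, not numbers from this thread.

```python
def decode_tokens_per_second(bandwidth_bytes_per_s: float,
                             params: float,
                             bytes_per_param: float,
                             batch_size: int = 1) -> float:
    """Upper bound on decode throughput under a pure bandwidth-bound model.

    Assumption: each decode step reads every weight exactly once, and that
    read is shared by all sequences in the batch. KV-cache traffic and
    compute limits are deliberately ignored.
    """
    weight_bytes = params * bytes_per_param
    steps_per_second = bandwidth_bytes_per_s / weight_bytes
    return steps_per_second * batch_size


# Illustrative inputs (assumptions, not measurements).
BANDWIDTH = 3.35e12   # bytes/s, roughly an H100-class HBM figure
PARAMS = 70e9         # hypothetical 70B-parameter model

baseline = decode_tokens_per_second(BANDWIDTH, PARAMS, bytes_per_param=2.0)
five_x_bandwidth = decode_tokens_per_second(5 * BANDWIDTH, PARAMS, bytes_per_param=2.0)
four_bit_weights = decode_tokens_per_second(BANDWIDTH, PARAMS, bytes_per_param=0.5)

print(f"baseline:        {baseline:.1f} tokens/s")
print(f"5x bandwidth:    {five_x_bandwidth / baseline:.1f}x baseline")
print(f"4-bit weights:   {four_bit_weights / baseline:.1f}x baseline")
```

Under this simplified model, a 5x speedup at fixed batch size needs either 5x the bandwidth or proportionally fewer bytes moved per token (for example, lower-precision weights), which is why the bandwidth question matters.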
