Trillion-parameter model instances (Opus, GPT, etc.) are not running on a single GPU. The catch is how the cards communicate and how the model is split up. There's a lot that goes into it, but the answer is yes: the more GPUs, the bigger the model you can run.
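A quick back-of-envelope sketch of why splitting helps: if you shard a model's weights evenly across GPUs (as tensor or pipeline parallelism does), per-card weight memory drops roughly linearly. The model size and GPU counts below are illustrative assumptions, and this ignores activations, KV cache, and communication overhead.

```python
def per_gpu_gb(n_params: float, bytes_per_param: int, n_gpus: int) -> float:
    """Weight memory per GPU in GB, assuming an even shard of the weights."""
    return n_params * bytes_per_param / n_gpus / 1e9

# A hypothetical 70B-parameter model in fp16 (2 bytes/param):
whole = per_gpu_gb(70e9, 2, 1)    # 140.0 GB -- far too big for one card
shard = per_gpu_gb(70e9, 2, 8)    # 17.5 GB  -- fits on eight 24 GB cards
print(whole, shard)
```

The same arithmetic is why inter-GPU bandwidth matters: every forward pass has to move partial results between those shards.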
Really cool. I'm very much still learning about this stuff. Sounds like this inter-GPU communication is a feature of special hardware (not consumer GPUs).