Hacker News

2ndorderthought, yesterday at 5:07 PM

1T model instances (Opus, GPT, etc.) are not running on a single GPU. The catch is how the cards communicate and how the model is split across them. There's a fair bit that goes into it, but the answer is yes: the more GPUs you have, the bigger the model you can run.
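To put rough numbers on that, here is a back-of-the-envelope sketch of how many cards you need just to hold the weights. Everything in it is an assumption for illustration: the function name `min_gpus_for_weights` is made up, the 1e12 parameter count is a stand-in for "1T" (actual sizes of those models are not public), and 2 bytes per parameter and 80 GB per card are example values. It ignores KV cache, activations, and framework overhead, so real deployments need more headroom.

```python
import math


def min_gpus_for_weights(n_params: float, bytes_per_param: float, gpu_mem_gb: float) -> int:
    """Smallest number of GPUs whose combined memory holds the weights alone.

    Hypothetical helper: ignores KV cache, activations, and runtime overhead.
    """
    weight_bytes = n_params * bytes_per_param
    gpu_bytes = gpu_mem_gb * 1e9  # treating 1 GB as 10^9 bytes
    return math.ceil(weight_bytes / gpu_bytes)


# Assumed example: 1e12 parameters, 16-bit weights (2 bytes each), 80 GB cards.
print(min_gpus_for_weights(1e12, 2, 80))
```

The count only tells you whether the weights fit. How fast it runs depends on how the layers or tensors are divided up and how quickly the cards can exchange data, which is the "catch" above.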


Replies

ryandrake, yesterday at 5:34 PM

Really cool. I'm very much still learning about this stuff. Sounds like this inter-GPU communication is a feature of special hardware (not consumer GPUs).
