Hacker News

zozbot234 | yesterday at 3:41 PM | 2 replies

Dedicated GPU VRAM is much scarcer than the unified RAM you get on Mac platforms. This is a big deal for SOTA LLMs, which combine a large memory footprint with a need for high memory bandwidth to reach acceptable performance.
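To make the footprint/bandwidth point concrete, here is a rough back-of-the-envelope sketch. It assumes a dense model whose weights are all read from memory once per generated token, and it ignores KV cache, activations, and compute limits, so the result is only a ceiling. The function names and the input numbers are hypothetical, picked purely for illustration rather than taken from any specific GPU or Mac.

```python
def weight_footprint_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits_in_memory(footprint_gb: float, capacity_gb: float) -> bool:
    """True if the weights alone fit in the given memory pool."""
    return footprint_gb <= capacity_gb


def decode_tokens_per_sec_ceiling(footprint_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on single-stream decode speed, assuming every weight
    is streamed from memory once per generated token (dense model)."""
    return bandwidth_gb_s / footprint_gb


if __name__ == "__main__":
    # Hypothetical inputs: a 70B-parameter model quantized to 4 bits per weight.
    footprint = weight_footprint_gb(params_billions=70, bits_per_weight=4)
    print(f"weights: {footprint:.1f} GB")

    # Hypothetical memory pools: a small dedicated pool vs. a large unified one.
    for name, capacity_gb, bandwidth_gb_s in [
        ("small dedicated pool", 24.0, 1000.0),
        ("large unified pool", 128.0, 400.0),
    ]:
        if fits_in_memory(footprint, capacity_gb):
            ceiling = decode_tokens_per_sec_ceiling(footprint, bandwidth_gb_s)
            print(f"{name}: fits, at most ~{ceiling:.1f} tokens/s")
        else:
            print(f"{name}: weights do not fit without offloading")
```

Under these assumptions capacity decides whether the model runs at all, and bandwidth then caps how fast it can generate, which is why both matter at once.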


Replies