Hacker News

simonw | 01/21/2025

I'm using an M2 64GB MacBook Pro. For the Llama 8B one I would expect 16GB to be enough.
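
A rough back-of-envelope sketch of where that 16GB figure comes from (the Q4 quantization level and the overhead allowance are my assumptions, not something stated in the thread):

    # Back-of-envelope memory estimate for a quantized local LLM.
    # Assumptions: ~0.5 bytes/parameter (Q4-style quantization) plus a
    # flat allowance for KV cache and runtime overhead.
    def estimate_memory_gb(params_billions: float,
                           bytes_per_param: float = 0.5,
                           overhead_gb: float = 2.0) -> float:
        return params_billions * bytes_per_param + overhead_gb

    for size in (8, 70):
        print(f"{size}B model: ~{estimate_memory_gb(size):.1f} GB")
    # 8B  -> ~6 GB  (fits comfortably in 16GB)
    # 70B -> ~37 GB (needs something like a 64GB machine)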

I don't have any experience running models on Windows or Linux, where your GPU VRAM becomes the most important factor.


Replies

dragonwriter | 01/21/2025

On Windows or Linux you can run from RAM or split layers between RAM and VRAM; running fully on GPU is faster than either of those, but the limit on what you can run at all isn’t VRAM.
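
As a concrete illustration of that RAM/VRAM split, here is a minimal sketch using llama-cpp-python, which exposes llama.cpp's n_gpu_layers option; the model path and layer count below are placeholder assumptions:

    # Minimal sketch: offload some transformer layers to VRAM, keep the
    # rest in system RAM. Model path and layer count are illustrative.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./llama-3-8b-instruct.Q4_K_M.gguf",  # hypothetical file
        n_gpu_layers=20,   # 20 layers on the GPU; the rest run from RAM
        # n_gpu_layers=-1  # would offload every layer, if VRAM allows
    )

    out = llm("Q: What is 2 + 2? A:", max_tokens=8)
    print(out["choices"][0]["text"])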

rane | 01/21/2025

Why isn't GPU VRAM a factor on an Apple Silicon Mac?
