Hacker News

segmondy · yesterday at 9:38 PM

You can run it on a Mac Studio with 512GB RAM; that's the easiest way. I run it at home on a multi-GPU rig with partial offload to system RAM.
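For anyone wondering what "partial offload" looks like in practice, here's a minimal sketch using llama-cpp-python with a GGUF quant of the model (the path and layer count are placeholders, not the commenter's actual setup); n_gpu_layers controls how many transformer layers land in VRAM while the remainder stay in system RAM.

```python
# Minimal sketch of partial GPU offload with llama-cpp-python.
# Assumes a CUDA (or Metal) build of llama-cpp-python and a GGUF
# quant of the model; the path and layer count below are hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="/models/model-q4_k_m.gguf",  # hypothetical local path
    n_gpu_layers=40,   # layers offloaded to VRAM; the rest run from system RAM
    n_ctx=4096,        # context window
)

out = llm("Explain partial GPU offload in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])
```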


Replies

johndough · yesterday at 9:51 PM

I was wondering whether multiple GPUs make it go appreciably faster when limited by VRAM. Do you have some tokens/sec numbers for text generation?
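For reference, a rough way to collect such numbers yourself, assuming the same llama-cpp-python setup sketched above (model path and parameters are again placeholders):

```python
# Rough tokens/sec measurement for text generation (setup as above).
import time
from llama_cpp import Llama

llm = Llama(model_path="/models/model-q4_k_m.gguf", n_gpu_layers=40, n_ctx=4096)

start = time.perf_counter()
out = llm("Write a short story about a GPU.", max_tokens=256)
elapsed = time.perf_counter() - start

generated = out["usage"]["completion_tokens"]  # tokens actually produced
print(f"{generated / elapsed:.1f} tokens/sec")
```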