Hacker News

H8crilA · today at 10:45 AM

How do you run this kind of model at home? On a CPU, in a machine that has about 1TB of RAM?


Replies

pixelpoet · today at 11:21 AM

Wow, it's 690GB of downloaded data, so yeah, 1TB sounds about right. Not even my two Strix Halo machines paired can do this, damn.

Gracana · today at 11:30 AM

You can do it slowly with ik_llama.cpp, lots of RAM, and one good GPU. Also regular llama.cpp, but the ik fork has some enhancements that make this sort of thing more tolerable.
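For illustration, here is a minimal sketch of the partial-offload idea using the llama-cpp-python bindings to regular llama.cpp (the ik fork's extra options are driven from its own CLI and aren't shown). The model path, layer count, context size, and thread count are placeholder assumptions, not tuned values for this model:

    # Minimal sketch, assuming a quantized GGUF of the model is already on disk.
    # Uses llama-cpp-python (bindings to regular llama.cpp); the path and numbers
    # below are illustrative placeholders, not tuned values.
    from llama_cpp import Llama

    llm = Llama(
        model_path="/models/big-moe-q4.gguf",  # hypothetical path to the quantized weights
        n_gpu_layers=20,   # offload as many layers as the single GPU's VRAM allows
        n_ctx=8192,        # keep the context modest so the KV cache stays small
        n_threads=32,      # CPU threads do most of the work on a RAM-heavy box
        use_mmap=True,     # memory-map the weights so they stream from the page cache
    )

    out = llm("Explain mixture-of-experts inference in one paragraph.", max_tokens=256)
    print(out["choices"][0]["text"])

This trades speed for fit: generation will be slow, but the model runs.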

bertili · today at 12:12 PM

Two 512GB Mac Studios connected with Thunderbolt 5.