> Didn't expect to go back to macOS but they're basically the only feasible consumer option for running large models locally.
I presume here you are referring to running on the device in your lap.
How about a headless Linux inference box in the closet / basement?
Return of the home network!
Not feasible for large models: it takes two 512GB M3 Ultras to run the full Kimi K2.5 model at a respectable 24 tok/s. Hopefully the M5 Ultra can improve on that.
Apple devices have the high memory bandwidth necessary to run LLMs at reasonable rates.
It's possible to build a Linux box that does the same, but you'll be spending a lot more to get there. With Apple, a $500 Mac Mini has memory bandwidth that you just can't get anywhere else for the price.
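For a rough sense of why bandwidth is the deciding factor: single-stream decode speed is approximately memory bandwidth divided by the bytes of weights streamed per generated token. Here's a back-of-envelope sketch; the bandwidth figures, the ~32B active parameters, the 4-bit quantization, and the efficiency factor are all illustrative assumptions, not benchmarks.

```python
# Back-of-envelope: decode throughput is bounded by how fast the active
# weights can be streamed from memory for each generated token.

def est_tokens_per_sec(bandwidth_gb_s: float,
                       active_params_billions: float,
                       bytes_per_param: float,
                       efficiency: float = 0.6) -> float:
    """Rough upper bound on single-stream decode tok/s."""
    bytes_per_token = active_params_billions * 1e9 * bytes_per_param
    return efficiency * bandwidth_gb_s * 1e9 / bytes_per_token

# Assumed figures: ~800 GB/s for an M3 Ultra, ~120 GB/s for a base Mac Mini,
# and a MoE model with ~32B active parameters quantized to 4 bits (0.5 B/param).
for name, bw in [("M3 Ultra (~800 GB/s)", 800), ("Base Mac Mini (~120 GB/s)", 120)]:
    print(f"{name}: ~{est_tokens_per_sec(bw, 32, 0.5):.0f} tok/s")
```

The estimate is crude (it ignores KV-cache reads, compute limits, and batching), but it shows why a high-bandwidth unified-memory machine decodes large models at usable speeds while a typical consumer PC without a big GPU does not.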