Hacker News

Havoc · today at 5:11 PM

I would think a quantized 27b should be doable in mac world too?


Replies

aegis_camera · today at 5:14 PM

My preference is LFM 450M for vision tasks and QWEN 9B Q4 for orchestration.

HanClinto · today at 5:39 PM

Yeah, but it can be a bit of a tight squeeze if you don't have at least 24 GB (preferably 32 GB+) of memory.

Especially if you want other apps to run at the same time, I think it's safer to stick with something more like 9b. You can see a table with quantized sizes here [0] -- yes, there are smaller quants than Q4_K_XL, but then you're down in the weeds with nickel-and-diming things, and if you want to even keep something like a (memory-hungry) instance of VSCode running, good luck.
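For a rough sense of why 27B is tight, here's a back-of-envelope sketch. It assumes ~4.5 bits per weight for Q4_K-style quants, which is an approximation (block scales push the effective rate above the nominal 4 bits), not an exact figure for any particular file:

```python
# Back-of-envelope estimate of quantized model weight size.
# Assumption: Q4_K-style quants average roughly 4.5 bits per weight.

def quant_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

print(f"27B @ ~4.5 bpw: {quant_size_gb(27e9, 4.5):.1f} GB")  # ~15.2 GB
print(f" 9B @ ~4.5 bpw: {quant_size_gb(9e9, 4.5):.1f} GB")   # ~5.1 GB
```

And that's just the weights -- you still need headroom for the KV cache, the OS, and whatever apps are open, which is why 27B crowds a 24 GB machine.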

IMO -- if 9b is doing the job, stick with 9b.

0 - https://github.com/ggml-org/LlamaBarn/pull/63