dajonker • yesterday at 4:43 PM

Yes, I usually run Unsloth models. However, you are now linking to the big model (355B-A32B), which I can't run on my consumer hardware.

The flash model in this thread is more than 10x smaller (30B).

Replies

a_e_k • yesterday at 5:25 PM

When the Unsloth quant of the flash model does appear, it should show up as unsloth/... on this page:

https://huggingface.co/models?other=base_model:quantized:zai...

Probably as:

https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF


latchkey • yesterday at 4:47 PM

There are a bunch of 4-bit quants in the GGUF link, and 0xSero has some smaller stuff too. They might still be too big, in which case you'll need to un-GPU-poor yourself.

