a_e_k • yesterday at 5:25 PM • 2 replies

When the Unsloth quant of the flash model does appear, it should show up as unsloth/... on this page:

https://huggingface.co/models?other=base_model:quantized:zai...

Probably as:

https://huggingface.co/unsloth/GLM-4.7-Flash-GGUF

Replies

homarp • yesterday at 5:33 PM

It's a new architecture, not yet implemented in llama.cpp.

Issue to follow: https://github.com/ggml-org/llama.cpp/issues/18931

dumbmrblah • yesterday at 5:33 PM

One thing to consider is that this version is a new architecture, so it'll take time for llama.cpp to be updated, similar to how it was with Qwen Next.
