Already quantized/converted into a sane format by Unsloth:
Why doesn't Qwen itself release the quantized model? My impression is that quantization is a highly nontrivial process that can degrade the model in non-obvious ways, so it's best handled by the people who actually built the model; otherwise the results might be disappointing.
Users of the quantized model might even be led to think the model itself is bad because the quantized version is.
How much VRAM does it need? I haven't run a local model yet, but I did recently pick up a 16GB GPU, before they were discontinued.
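A rough back-of-envelope answer: VRAM is dominated by the weights, which scale with parameter count times bits per weight, plus some headroom for KV cache and activations. The sketch below is a minimal estimate; the 14B parameter count and the 1.2x overhead factor are illustrative assumptions, not figures from this thread.

```python
def estimate_vram_gb(n_params_b: float, bits_per_weight: float,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: weights only, times an overhead
    factor for KV cache and activations (the factor is a guess)."""
    weight_bytes = n_params_b * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# e.g. a hypothetical 14B-parameter model at 4-bit quantization:
print(f"{estimate_vram_gb(14, 4):.1f} GB")  # -> 8.4 GB
```

By this estimate, a 16GB card comfortably fits a ~14B model at 4-bit, while the same model at full 16-bit precision (~28GB of weights alone) would not.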
I sense that I don't really understand enough of your comment to know why this is important. I hope you can explain some things to me:
- Why is Qwen's default "quantization" setup "bad"?
- Who is Unsloth?
- Why is their format better? What gains does a better format give? What are the downsides of a bad format?
- What is quantization?

Granted, I could look this up myself, but I thought I'd ask for the full picture for the benefit of other readers.
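For the last question: quantization stores each weight in fewer bits (e.g. 8 or 4 instead of 16/32), trading a small, bounded rounding error for a large memory saving. A minimal sketch of symmetric int8 quantization, in plain Python with made-up example weights:

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: scale floats into [-127, 127] integers."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [x * scale for x in q]

w = [0.82, -1.3, 0.05, 2.54]        # illustrative weights
q, s = quantize_int8(w)
w2 = dequantize(q, s)
# per-weight round-trip error is bounded by scale/2
```

Real schemes (GGUF's k-quants, AWQ, GPTQ) are more elaborate, quantizing in small blocks with per-block scales and calibration data, but the core idea is the same: fewer bits per weight, with scale factors to recover approximate values.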
There's absolutely nothing wrong or insane about a safetensors file. It might be less convenient than a single-file GGUF, but that's just laziness, not insanity.
Unsloth is great for uploading quants quickly to experiment with, but everyone should know that they almost always revise their quants after testing.
If you download the release-day quants with a tool that doesn't automatically check Hugging Face for new versions, you should check back in a week or so for updated versions.
Sometimes the launch-day quantizations have major problems, which leads early adopters to dismiss useful models. You have to wait for everyone to test and fix bugs before giving a model a real evaluation.