For those interested, I made some Dynamic Unsloth GGUFs for local deployment at https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF and wrote a guide on using Claude Code / Codex locally: https://unsloth.ai/docs/models/qwen3-coder-next
Hi Daniel, I've been using some of your models on my Framework Desktop at home. Thanks for all that you do.
Asking from a place of pure ignorance here, because I don't see the answer on HF or in your docs: Why would I (or anyone) want to run this instead of Qwen3's own GGUFs?
Still hoping IQuest-Coder gets the same treatment :)
What is the difference between the UD and non-UD files?
Good results with your Q8_0 version on 96GB RTX 6000 Blackwell. It one-shotted the Flappy Bird game and also wrote a good Wordle clone in four shots, all at over 60 tps. Thanks!
Is your Q8_0 file the same as the one hosted directly on the Qwen GGUF page?
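One way to check this yourself, if you have both files downloaded, is to compare their SHA-256 digests. This is only a minimal sketch: the helper names `sha256_of` and `same_bytes` and the paths in the usage comment are made up for illustration. Matching digests mean the files are byte-for-byte identical. Differing digests do not by themselves prove the weights differ, since a GGUF file also carries metadata.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """True only if both files are byte-for-byte identical."""
    return sha256_of(path_a) == sha256_of(path_b)


# Usage (hypothetical local paths, not real file names):
# same_bytes("unsloth/model-Q8_0.gguf", "qwen/model-Q8_0.gguf")
```

Reading in chunks keeps memory use flat, which matters for files in the tens of gigabytes.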
Nice! Getting ~39 tok/s @ ~60% GPU util. (~170W out of 303W per nvtop).
System info:
llama.cpp command-line: