Yes.
-hfd for the draft model.
Nice, was wondering if there was a flag for the draft as well.
Not knocking huggingface-cli, I just find it's much easier for people to try this stuff out when they can just run:
mise use --global github:ggml-org/llama.cpp
LLAMA_CACHE="models" llama-server \
  -hf unsloth/gemma-4-26B-A4B-it-qat-GGUF:UD-Q4_K_XL \
  --host 0.0.0.0 \
  --port 11434 \
  ...
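For anyone wanting to try the draft flag mentioned above, here is a rough sketch of how -hfd might sit next to -hf in the same invocation. The draft repo name is a made-up placeholder (nobody in the thread named one), so swap in a real GGUF repo that is compatible with the main model. The script only builds and prints the command so it can be checked before running.

```shell
#!/usr/bin/env bash
# Sketch: pair the main model (-hf) with a draft model (-hfd).
# MAIN_REPO is the repo from the command above.
# DRAFT_REPO is a hypothetical placeholder, not a real repository name.
MAIN_REPO="unsloth/gemma-4-26B-A4B-it-qat-GGUF:UD-Q4_K_XL"
DRAFT_REPO="your-org/your-draft-model-GGUF"

# Build the command as an array so each argument stays a single word.
cmd=(llama-server
  -hf "$MAIN_REPO"
  -hfd "$DRAFT_REPO"
  --host 0.0.0.0
  --port 11434)

# Print the command for review instead of launching the server.
printf '%s ' "${cmd[@]}"
echo
```

Once the placeholder is replaced, running "${cmd[@]}" in the same shell launches the server with both models.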