I wonder if (or when) a GGUF version of this 8B model will be available. I want to try it out locally in Jan on my base M4 Mac mini. I currently run Llama 3 8B Instruct Q4 at around 20 t/s, and it sounds like this would be a huge improvement in output quality.
Making your own GGUFs is trivial: https://rentry.org/tldrhowtoquant
It's a bit harder when they've provided the safetensors in FP8, as with the DS3 (DeepSeek-V3) series, but these smaller distilled models appear to ship in BF16, so the normal convert/quantize pipeline should work fine.
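For reference, a minimal sketch of that pipeline with llama.cpp (paths, output filenames, and the Q4_K_M target are placeholders; the script and binary names match recent llama.cpp checkouts and may differ in older ones):

    # one-time setup: get llama.cpp and its Python deps
    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp && pip install -r requirements.txt

    # 1) convert the BF16 safetensors to a GGUF file
    python convert_hf_to_gguf.py /path/to/DeepSeek-R1-Distill-Llama-8B \
        --outfile r1-distill-llama-8b-bf16.gguf --outtype bf16

    # 2) quantize it down (build llama-quantize first, e.g. via cmake;
    #    Q4_K_M is a common quality/size tradeoff)
    ./llama-quantize r1-distill-llama-8b-bf16.gguf \
        r1-distill-llama-8b-Q4_K_M.gguf Q4_K_M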
YC’s own incredible Unsloth team already has you covered:
https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B
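If you'd rather grab a prebuilt quant than convert it yourself, something like this should work, assuming Unsloth's usual -GGUF companion repo and a Q4_K_M file (both names are guesses on my part; check the repo's file list):

    # download one quant file from the GGUF repo (repo/file names assumed)
    huggingface-cli download unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF \
        DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf --local-dir .

Then just point Jan (or llama.cpp directly) at the downloaded .gguf file.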