Just tried hf.co/unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF:Q4_K_M on Ollama and my oh my are these models chatty. They just ramble on for ages.
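Roughly what I'm running, for reference. This assumes the official ollama Python client; num_predict just hard-caps the generated tokens so it can't ramble forever, it doesn't actually make the model less chatty:

    import ollama

    MODEL = "hf.co/unsloth/DeepSeek-R1-Distill-Qwen-14B-GGUF:Q4_K_M"

    # Pull the GGUF straight from Hugging Face (same path "ollama run" accepts).
    ollama.pull(MODEL)

    # num_predict caps the number of generated tokens, so the chain of
    # thought gets cut off instead of going on for pages.
    resp = ollama.chat(
        model=MODEL,
        messages=[{"role": "user", "content": "Integrate x^2 * e^x dx."}],
        options={"num_predict": 2048},
    )
    print(resp["message"]["content"])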
I noticed the smaller the model (whether from quantization or parameter count), the faster it'd run... but the longer it'd fight itself. For the same Calc II-level problem, all the models eventually got to an answer, but the distilled Qwen-32B at Q6 quant was the fastest to reach an actual final answer.
They need to be trained with a small length penalty.
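Something like this, presumably, folded into the reward during fine-tuning. Just a sketch of the idea in Python with a made-up penalty coefficient, not DeepSeek's actual setup:

    def length_penalized_reward(base_reward: float, num_generated_tokens: int,
                                penalty_per_token: float = 1e-4) -> float:
        """Reward shaping: a correct answer still dominates, but every extra
        token of chain-of-thought costs a little, so shorter solutions win ties."""
        return base_reward - penalty_per_token * num_generated_tokens

    # e.g. two correct answers (base_reward = 1.0), one at 800 tokens, one at 6000:
    print(length_penalized_reward(1.0, 800))    # 0.92
    print(length_penalized_reward(1.0, 6000))   # 0.40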
I find QwQ 32B a bit like that. I asked for a recipe for something in Minecraft 1.8, and it was page after page of 'hmm, that still doesn't look right, maybe if I try...', although to be fair I did ask for an ASCII art diagram of the result. It will be interesting to try a DeepSeek-distilled QwQ 32B if that is planned; otherwise I'm pretty happy with it.
I just wish that less development chat was happening within walled gardens because none of these seem to be much help with Zig.