regexorcist • yesterday at 7:03 PM • 1 reply

Sounds like a game changer if I see that kind of speedup on my hardware. So far I've preferred Qwen 3.6 because of its better tool handling, even though Gemma 4 is faster, but I saw they've updated the model template and that's supposed to be better now. Looking forward to trying this with llama.cpp.

Replies

ch_sm • yesterday at 7:46 PM

Gemma 4 has a specific problem with tool calls that affects most runtimes. Fixes for Ollama and vLLM are being worked on right now.
