logoalt Hacker News

nzeidyesterday at 7:34 PM2 repliesview on HN

A few days ago I switched again from Qwen3.6 to Gemma 4 - for personal use I've experienced better average performance with the 26B version of the latter than the 27B of the former.

For someone who's been running local models for a long while, these are very very exciting times.


Replies

girvotoday at 12:42 AM

Oh that's fascinating. 3.6 27B is pretty damned good, but slow in wall-clock times on my DGX Spark-alike. It generates huge reams of thinking before it gets the (usually correct!) answer, so wall-clock time is rough for tasks even at ~20tk/s

I'm surprised the 26B-A4B is better? It should be faster too, interesting. I'm excited to try 31B with MTP, because MTP-2 is what makes 27B bearable on the GB10.

What are you using it for? Agent-based coding, or something else?

apexalphayesterday at 8:25 PM

I’ve been swapping between these too as well.

However I find qwen unbeatable for toolcallling. I think gemma wasnt trained on that at all.

show 3 replies