on the Will It Mythos benchmark, small models are punching way above their weight(s) gemma4-26B (#...

verdverm • today at 1:18 AM • 1 reply

On the Will It Mythos benchmark, small models are punching way above their weight(s):

gemma4-26B (#7)

qwen-3.6-27B (#9)

https://news.ycombinator.com/item?id=48640196

Replies

I've tried running qwen 3.6 locally, and it felt like LLMs from a year ago: you can get them to do some things, but the tasks have to be very small, and you have to course-correct them so often that it's hard to say it's any faster than doing it all yourself.

Certainly the gap is closing, but I feel it still makes more sense to pay pennies to run the full-sized open models hosted on much better hardware.
