Hacker News

lambda, yesterday at 5:59 PM

I've run several local models that get this right. Qwen 3.5 122B-A10B gets this right, as does Gemma 4 31B. These are local models I'm running on my laptop GPU (Strix Halo, 128 GiB of unified RAM).

I've been using this as a routine test when changing various parameters, so I've run it several times, and these models consistently get it right. It's amazing that Opus 4.7 whiffs it; these models are a couple of orders of magnitude smaller, at least if the rumors about the size of Opus are true.


Replies

qingcharles, yesterday at 6:40 PM

Does Gemma 4 31B run full res on Strix or are you running a quantized one? How much context can you get?
