Indeed, it performed worse than Qwen3.6-27b in my basic test.
It gave a fancier-looking answer, but did a worse job of following the prompt.
That's roughly my experience so far; it trips over itself a bit.
However, it's much more inclined to do web search unprompted, which is fascinating in its own way.