Huh, according to that model card this is a 137B total parameter model.
Performance doesn't seem that good:
- MAI-Code-1-Flash (137B-A5B) = 51% on SWE-bench Pro
- Qwen3.6-35B-A3B = 49.5% on SWE-bench Pro (https://huggingface.co/Qwen/Qwen3.6-35B-A3B)
They benchmark against Claude Haiku, but Haiku is not good: it's worse than tiny open models you can run locally or via API at 10% of the cost.
> 137B-A5B
Yeah, not a 5B param model as the earlier title implied!
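For anyone unfamiliar with the naming: a tag like "137B-A5B" is usually read as 137B total parameters with about 5B active per token (mixture-of-experts). A rough sketch of that reading, assuming the common `<total>B-A<active>B` convention; `parse_moe_tag` is a hypothetical helper, not something from the model card:

```python
import re

def parse_moe_tag(tag: str) -> tuple[float, float]:
    """Split a tag like '137B-A5B' into (total, active) parameters, in billions.

    Assumes the '<total>B-A<active>B' naming convention; other formats raise.
    """
    m = re.fullmatch(r"(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B", tag)
    if m is None:
        raise ValueError(f"not a '<total>B-A<active>B' tag: {tag!r}")
    return float(m.group(1)), float(m.group(2))

# Tags taken from the two models compared in this thread.
for name, tag in [("MAI-Code-1-Flash", "137B-A5B"), ("Qwen3.6-35B-A3B", "35B-A3B")]:
    total, active = parse_moe_tag(tag)
    share = active / total  # fraction of the weights used per token
    print(f"{name}: {total:g}B total, {active:g}B active ({share:.1%} active)")
```

So you still need to hold all 137B weights in memory; only the per-token compute looks like a ~5B model.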
The takeaway is that this is a smaller model that competes with Haiku. I would hope they come out with a "Sonnet"-competing model, then an "Opus" one. I have been wondering why Microsoft is kind of "sleeping" on offering models they themselves have made in Copilot. Maybe it was part of their deal with OpenAI? Not sure.