> Sonnet 4.5, starting at $3/$15 per million tokens.
Are people really willing to pay these prices? The open-weight models are catching up at a rapid pace while keeping prices low. MiniMax M2.5, Kimi 2.5 and GLM-5 are dirt cheap compared to this. They may not be SOTA, but they are more than good enough.
It depends on how much you value the gap between “pretty good” and SOTA… I’ve noticed that Opus is more “expensive,” but an error-filled rabbit hole is expensive too!
Some people will want models like Claude, where you don't have to be super-specific and it will infer exactly what you mean.
With the GLM models you have to confirm with it exactly what you want, and not miss any detail.
I made my own benchmarks, very basic questions, and Claude 4.6 is actually worse than the free Stepfun 3.5 version: https://aibenchy.com
It is smart, but it fails at basic instruction following sometimes.
I remember this being a Claude thing for quite a while: I kept trying to make it output just JSON (without structured output), and it kept adding quotes or newlines.
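For what it's worth, one way to live with this without structured output is to clean up the reply before parsing it. This is only a rough sketch: `parse_model_json` is a hypothetical helper of my own, not part of any SDK, and it assumes the reply is expected to be a JSON object or array that may arrive padded with newlines, wrapped in a Markdown code fence, or wrapped in an extra pair of quotes.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence marker, built this way to keep the source readable

# Matches a reply that is entirely one fenced block, with an optional language tag.
_FENCED = re.compile(
    r"^" + FENCE + r"[a-zA-Z]*\s*(.*?)\s*" + FENCE + r"$", re.DOTALL
)


def parse_model_json(raw: str):
    """Best-effort parse of a model reply that was supposed to be bare JSON."""
    # Stray leading/trailing newlines and spaces are the easy case.
    text = raw.strip()

    # Unwrap a surrounding code fence if the model added one.
    fenced = _FENCED.match(text)
    if fenced:
        text = fenced.group(1)

    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        # The payload may be wrapped in unescaped quotes, e.g. "{"a": 1}".
        if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
            value = json.loads(text[1:-1].strip())
        else:
            raise

    # If the payload was a properly escaped JSON string containing JSON,
    # decode it one more time. Assumption: a plain string is never the
    # expected payload, so this second pass is safe for our use.
    if isinstance(value, str):
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            pass
    return value
```

Anything that still fails to parse raises `json.JSONDecodeError`, so the caller can retry the request or log the raw reply instead of silently accepting garbage.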