Claude subscription is equivalant of spot instance
And APIs are on-demand service equivalant.
Priority is set to APIs and leftover compute is used by Subscription Plans.
When there is no capacity, subscriptions are routed to Highly Quantized cheaper models behind the scenes.
Selling subscription makes it cheaper to run such inference at scale otherwise many times your capacity is just sitting there idle.
Also, these subscription help you train your model further on predictable workflow (because the model creators also controls the Client like qwen code, claude code, anti gravity etc...)
This is probably why they will ban you for violating TOS that you cannot use their subscription service model with other tools.
They aren't just selling subscription, but the subscription cost also help them become better at the thing they are selling which is coding for coding models like Qwen, Claude etc...
I've used qwen code, codex and claude.
Codex is 2x better than Qwen code and Claude is 2x better than Codex.
So I'd hope the Claude Opus is atleast 4-5x more expensive to run than flagship Qwen Code model hosted by Alibaba.
> Claude is 2x better than Codex
This hasn't been true in a long time.
[dead]
> When there is no capacity, subscriptions are routed to Highly Quantized cheaper models behind the scenes.
Have they announced this?