Anthropic measures your usage based on token consumption
We are paying for a certain amount of token consumption
Why, then, is this an outsized strain on your system, Anthropic?
It's like buying gasoline from Shell, and then Shell's terms of service forcing you to use that gas in a Hummer that gets 5 MPG, while everyone else wants to drive any other vehicle.
I feel icky replying in favor of a for-profit entity, but here goes...
> We are paying for a certain amount of token consumption
I don't think you are. The specific arrangement you have is that you pay for a subscription to be used with Claude Code. It isn't access to tokens that you can use however you please.
---
An analogy would be a refillable soda cup at a restaurant. They will let you refill it as many times as you want, but only using the store-provided cup, and you can't bring your own 2L Hydro Flask or whatever. You're paying not just for the liquid, but for the entire setup.
You are making the false assumption that all token consumption costs the same, when it doesn't. Yes, in the limit the price to serve the model and generate a response is O(tokens), but generating a new token can be cheaper when the context is short than when it is long. If other harnesses prompt with more tokens than Claude Code, they can be more expensive to serve.
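To make the point above concrete, here is a toy cost model in Python. It is only a sketch: the two constants and both function names are made up for illustration, it ignores prompt processing and caching, and it says nothing about what any provider's real serving costs look like. The only assumption it encodes is that each new token costs a fixed amount plus a bit extra for every token already in context.

```python
# Toy model only: both constants are hypothetical, not real serving costs.
FIXED_COST_PER_TOKEN = 1.0                 # work that does not depend on context size
ATTENTION_COST_PER_CONTEXT_TOKEN = 0.001   # extra work per token already in context


def cost_of_next_token(context_len: int) -> float:
    """Cost to generate one token with context_len tokens already in context."""
    return FIXED_COST_PER_TOKEN + ATTENTION_COST_PER_CONTEXT_TOKEN * context_len


def cost_of_response(prompt_len: int, output_len: int) -> float:
    """Total cost to generate output_len tokens after a prompt of prompt_len tokens."""
    # The context grows by one token for every token generated.
    return sum(cost_of_next_token(prompt_len + i) for i in range(output_len))


# Same number of output tokens, very different prompt sizes.
short_prompt = cost_of_response(prompt_len=2_000, output_len=500)
long_prompt = cost_of_response(prompt_len=50_000, output_len=500)
print(f"short prompt: {short_prompt:.2f}, long prompt: {long_prompt:.2f}")
```

Under these made-up numbers the same 500 output tokens cost far more behind the longer prompt, which is the sense in which two requests with equal output can put very different load on the server.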
If you're on a subscription plan, you're paying for a certain amount of maximum token consumption. Mass-market consumers generally prefer this model to one where they're billed for actual usage. But making it work requires statistical estimates of how much people will consume, which often requires excluding third-party tools that circumvent those estimates.
To use your analogy, if Shell sold you a subscription to fill up your Hummer up to 30 times a month, they wouldn't let you use that subscription to fill gas cans with a GMC logo taped to the side. They couldn't, without overcharging the people who just want to average out their cost of driving.
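A rough sketch of the arithmetic behind that argument, in Python. Everything here is hypothetical: the usage fractions, the cost of a fully used cap, and the `break_even_price` helper are invented for illustration and are not anyone's actual pricing method. The idea is just that a flat price is set from average usage, so a few subscribers running at the cap pull the price up for everyone.

```python
def break_even_price(usage_fractions: list[float], cap_cost: float) -> float:
    """Flat price that covers the average cost per subscriber.

    usage_fractions: each subscriber's usage as a fraction of the cap (0.0 to 1.0).
    cap_cost: what it costs the provider to serve one fully used cap.
    """
    return cap_cost * sum(usage_fractions) / len(usage_fractions)


CAP_COST = 1000.0  # hypothetical cost of serving a subscriber who hits the cap

# Hypothetical population: most subscribers use a small share of their cap.
typical = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.4, 0.1, 0.2, 0.3]

# Same population, but the last two now use a tool that runs at the cap.
with_heavy_users = typical[:8] + [1.0, 1.0]

price_typical = break_even_price(typical, CAP_COST)
price_with_heavy = break_even_price(with_heavy_users, CAP_COST)
print(f"typical: {price_typical:.2f}, with heavy users: {price_with_heavy:.2f}")
```

With these invented numbers, two out of ten subscribers maxing out their cap is enough to raise the break-even price substantially for the other eight.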