
eagleinparadise • yesterday at 11:03 PM • 4 replies

Anthropic measures your usage based on token consumption

We are paying for a certain amount of token consumption

Why, then, is this an outsized strain on your system, Anthropic?

It's like buying gasoline from Shell, and then Shell's terms of service forcing you to use that gas in a Hummer that gets 5 MPG, while everyone else wants to drive any other vehicle.

Replies

SpicyLemonZest • yesterday at 11:29 PM

If you're on a subscription plan, you're paying for a certain maximum amount of token consumption. Mass-market consumers generally prefer this model to one where they're billed for actual usage. But making it work requires statistical estimates of how much people will consume, which often requires excluding third-party tools that circumvent those estimates.

To use your analogy, if Shell sold you a subscription to fill up your Hummer up to 30 times a month, they wouldn't let you use that subscription to fill gas cans with a GMC logo taped to the side. They couldn't, without overcharging the people who just want to average out their cost of driving.


bitpush • yesterday at 11:23 PM

I feel icky replying in favor of a for-profit entity, but here goes...

> We are paying for a certain amount of token consumption

I don't think you are. The specific arrangement you have is that you pay for a subscription to be used with Claude Code. It isn't access to tokens that you can use however you please.

---

An analogy would be a refillable soda cup at a restaurant. They'll let you refill it as many times as you want, but only using the store-provided cup; you can't bring your own 2L Hydro Flask or whatever. You're paying not just for the liquid, but for the entire setup.


charcircuit • yesterday at 11:44 PM

You are making the false assumption that all token consumption costs the same, when it doesn't. Yes, in the limit the price to serve the model and generate a response is O(tokens), but generating a new token can be cheaper when the context holds few tokens than when it holds many. If other harnesses prompt with more tokens than Claude Code does, they can be more expensive to serve.
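
To make that point concrete, here is a toy sketch, not a description of how any provider actually prices or serves requests. It assumes each generated token costs a fixed amount plus an amount proportional to the number of tokens already in context. The function name `decode_cost` and the constants `base` and `per_context` are hypothetical, chosen only for illustration, and the sketch ignores prompt processing and caching.

```python
def decode_cost(prompt_tokens: int, new_tokens: int,
                base: float = 1.0, per_context: float = 0.001) -> float:
    """Toy cost of generating `new_tokens` after a prompt of `prompt_tokens`.

    Assumption (hypothetical): each generated token costs `base` plus
    `per_context` times the number of tokens already in context.
    """
    total = 0.0
    for i in range(new_tokens):
        context = prompt_tokens + i  # tokens the new token must attend to
        total += base + per_context * context
    return total


# Same number of generated tokens, very different prompt sizes.
short_prompt = decode_cost(prompt_tokens=2_000, new_tokens=500)
long_prompt = decode_cost(prompt_tokens=50_000, new_tokens=500)
print(f"short prompt: {short_prompt:.2f}, long prompt: {long_prompt:.2f}")
```

Under these assumed constants, the same 500 output tokens cost far more to produce behind the longer prompt, which is the sense in which two requests with equal output can differ in what they cost to serve.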

