charcircuit • yesterday at 11:44 PM

You are making the false assumption that every token costs the same to serve, when it doesn't. Yes, in the limit the price to serve the model and generate a response is O(tokens), but generating a new token is cheaper when the token count is small than when it is large. If other harnesses prompt with more tokens than Claude Code, they can be more expensive to serve.
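
To make the point concrete, here is a toy sketch of that cost argument. It assumes each generated token has a fixed cost plus a cost proportional to the tokens already in context; the function name `decode_cost` and both constants are made up for illustration and are not real serving numbers from any provider. Prompt processing is ignored entirely.

```python
def decode_cost(prompt_tokens: int, output_tokens: int,
                fixed: float = 1.0, per_context_token: float = 0.001) -> float:
    """Toy cost of generating `output_tokens` after a prompt of `prompt_tokens`.

    Assumption (hypothetical model): each new token costs a fixed amount plus
    an amount proportional to the number of tokens already in context.
    """
    total = 0.0
    for i in range(output_tokens):
        context_len = prompt_tokens + i  # tokens the new token must attend to
        total += fixed + per_context_token * context_len
    return total


# Same number of output tokens, different prompt sizes.
small = decode_cost(prompt_tokens=2_000, output_tokens=500)
large = decode_cost(prompt_tokens=20_000, output_tokens=500)

print(f"short prompt: {small:.2f} cost units")
print(f"long prompt:  {large:.2f} cost units")
```

Under these assumed constants the same 500 output tokens cost more after the longer prompt, which is the sense in which a harness that sends bigger prompts could be more expensive to serve per output token.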

Replies

stavros • yesterday at 11:47 PM

They have limits. I don't care how expensive it is to serve. I'm paying them for a given number of tokens (a limit which THEY SET), and they also want to dictate where I spend those tokens.
