It works better, until you run out of tokens. Running out of tokens never used to happen to me, but this month it has started happening regularly.
Maybe I could avoid running out of tokens by turning off 1M tokens and max effort, but that's a cure worse than the disease IMO.
I would hazard a guess that people have the wrong intuition about the long-context pricing and are complaining because of that.
Yeah, the per-token price stays the same, even with a large context. But that still means you're spending 4x more cache-read tokens per turn in a 400k-context conversation than you would in a 100k-context conversation.
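To make the scaling concrete, here's a rough back-of-the-envelope sketch. Every turn re-reads the whole accumulated context, so per-turn cache-read spend grows linearly with context size even at a flat per-token price. The price constant below is a made-up placeholder, not anyone's actual rate card:

```python
# Hypothetical $ per million cache-read tokens -- illustrative only.
CACHE_READ_PRICE_PER_MTOK = 0.30

def per_turn_cache_read_cost(context_tokens: int) -> float:
    """Cost of re-reading the existing context once, on a single turn."""
    return context_tokens / 1_000_000 * CACHE_READ_PRICE_PER_MTOK

for ctx in (100_000, 400_000):
    print(f"{ctx:>7,} tokens of context -> ${per_turn_cache_read_cost(ctx):.4f} per turn")

# 100,000 tokens of context -> $0.0300 per turn
# 400,000 tokens of context -> $0.1200 per turn  (4x the 100k figure)
```

Whatever the real numbers are, the ratio is the point: four times the context means four times the cache-read tokens billed on every single turn.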