From what I understand, you shouldn't wait more than 5 minutes between prompts without compacting or clearing, or you'll pay for reinitializing the cache. With compaction you still pay, but for fewer input tokens. (Is compaction itself free?)
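To put rough numbers on that, here's a back-of-the-envelope sketch. The multipliers (cache write at 1.25x and cache read at 0.1x the base input price) and the token counts are assumptions for illustration, not figures from this thread, and `prompt_input_cost` is a made-up helper. It also ignores whatever the compaction step itself costs, since I don't know the answer to that.

```python
# All multipliers are assumptions for illustration, expressed relative to the
# price of one ordinary (uncached) input token.
CACHE_WRITE_MULT = 1.25  # assumed cost of writing a token into the cache
CACHE_READ_MULT = 0.10   # assumed cost of reading a cached token back


def prompt_input_cost(context_tokens: int, new_tokens: int, cache_warm: bool) -> float:
    """Input cost of one prompt, in 'base input token' units (hypothetical helper)."""
    if cache_warm:
        # Existing context is read from the cache; only the new tokens are written.
        return context_tokens * CACHE_READ_MULT + new_tokens * CACHE_WRITE_MULT
    # Cache expired: the whole context plus the new tokens is written again.
    return (context_tokens + new_tokens) * CACHE_WRITE_MULT


if __name__ == "__main__":
    # Hypothetical sizes: full conversation, compacted conversation, new prompt.
    full, compacted, new = 100_000, 20_000, 1_000
    print("warm cache, full context:     ", prompt_input_cost(full, new, True))
    print("cold cache, full context:     ", prompt_input_cost(full, new, False))
    print("cold cache, compacted context:", prompt_input_cost(compacted, new, False))
```

Under those assumptions, the cold prompt on the full context costs roughly ten times the warm one, and compacting first claws most of that back. The exact ratio depends entirely on the assumed multipliers and context sizes.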
Yeah, the caching change is probably 90% of the “I run out of usage so fast now!” issues.
Is it 5 mins between prompts while you're actively working, or 5 mins as in: if I step away from the computer for 5 mins, then come back and prompt again, I'm subject to reinit?
If it's the latter, that's crazy. I don't even know what to do there; compactions already feel like a memory wipe.
Ah, I can see how my phrasing might be misleading, but these prompts were made within 5 minutes of each other; the timings I mentioned were how long Claude spent working.
>pay for reinitializing the cache
Why can't they save the KV cache to disk, then reload it into memory later?