Hacker News

nextaccountic today at 1:35 AM

Do you pay for the full context on every prompt? What happened to the idea of caching the context server-side?


Replies

weird-eye-issue today at 4:45 AM

It helps a ton, but it doesn't last forever and you still have to pay to write to the cache.

davesque today at 1:49 AM

You don't. Most of the time (after the first prompt following a compaction or context clear) the context prefix is cached, and you pay something like 10% of the normal price for cached tokens. But each prompt still costs an amount proportional to the current context length, so your total cost is still roughly the area under a line with positive slope. So it increases quadratically with context length.
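To make the "area under a line" point concrete, here is a rough sketch of the arithmetic. Everything in it is an assumption for illustration: the function name `conversation_cost`, a fixed number of new tokens per turn, a price of 1 unit per uncached token, and the 10% cached rate mentioned above. It ignores cache write fees and cache expiry, so real bills will differ.

```python
def conversation_cost(turns, tokens_per_turn,
                      price_per_token=1.0, cached_discount=0.10):
    """Total input cost when every turn re-sends the whole context.

    Hypothetical model: the prefix sent on earlier turns is billed at
    the cached rate; only the newly added tokens are billed at full price.
    Cache write fees and cache expiry are deliberately left out.
    """
    total = 0.0
    context = 0  # tokens already in the context (the cached prefix)
    for _ in range(turns):
        cached_cost = context * price_per_token * cached_discount
        new_cost = tokens_per_turn * price_per_token
        total += cached_cost + new_cost
        context += tokens_per_turn  # the context grows linearly per turn
    return total


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, conversation_cost(n, 1000))
```

With these assumed numbers, 10 turns cost about 14,500 units and 20 turns about 39,000, so doubling the conversation length more than doubles the bill. The cached term is a sum of a linearly growing quantity, which is where the quadratic growth comes from; the discount only shrinks its coefficient.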