do you pay for the full context every prompt? what happened with the idea of caching the context server side?
You don't. Most of the time (after the first prompt following a compaction or context clear) the context prefix is cached, and you pay something like 10% of the normal input price for cached tokens. But your total cost is still roughly the area under a line with positive slope, so it increases quadratically with context length.
It helps a ton, but the cache doesn't last forever, and you still have to pay to write to it.
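To put rough numbers on the "area under a line" point, here is a minimal sketch of a toy cost model. The function name, the fixed tokens-per-turn, and the 1.25x cache-write multiplier are all assumptions for illustration; only the ~10% cached-read figure comes from the thread. It also assumes the cache never expires mid-conversation, which is the optimistic case.

```python
def conversation_cost(turns, tokens_per_turn,
                      cached_read_factor=0.10,   # ~10% of base price, per the thread
                      cache_write_factor=1.25):  # assumed write premium, not a quoted price
    """Return (read_cost, write_cost) in units of one uncached input token.

    Toy model: every turn appends tokens_per_turn new tokens, re-reads the
    whole existing prefix from cache, and writes the new tokens to the cache.
    """
    read_cost = 0.0
    write_cost = 0.0
    context = 0  # tokens already sitting in the cached prefix
    for _ in range(turns):
        read_cost += context * cached_read_factor           # grows every turn
        write_cost += tokens_per_turn * cache_write_factor  # constant per turn
        context += tokens_per_turn
    return read_cost, write_cost


if __name__ == "__main__":
    for n in (50, 100, 200):
        read, write = conversation_cost(n, 1000)
        print(f"{n:>4} turns: cached reads = {read:,.0f}, cache writes = {write:,.0f}")
```

In this model, doubling the number of turns doubles the write cost but roughly quadruples the cached-read cost, because each turn re-reads a prefix that keeps growing. The 10% discount shrinks the constant, not the quadratic shape.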