It’s wild that cached tokens count in full. What’s in it for you to care about caching at all, then? Is the processing speed gain significant?
Not really worth it, in general. It does reduce latency a little. In practice, you do have a continuing context, though, so you end up using it whether you care or not.