They have to know that this could bite them, and they have to know to ask the question first.
I do think having some insight into the current state of the cache and a realistic estimate for prompt token use is something we should demand.