I have come across claims that turning on caching means the LLM has a faint memory of what was in the cache, even for unrelated queries. If that's the case, it's completely unreasonable to share the cache, because of the possibility of information leakage.
How would information leak, though? There’s no difference in the probability distribution the model outputs when caching vs not caching.
This is absolutely 100% incorrect.
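Both things can be true: caching stores intermediate state (e.g. KV prefixes), so the sampled output distribution is unchanged, yet a *shared* cache can still leak through timing, because a cache hit skips the prefill work and responds faster. An attacker who can time requests can probe whether another tenant's prompt is already cached. Here's a toy sketch of that side channel; the `SharedPrefixCache` class and the sleep-based "prefill" cost are made-up illustrations, not any real serving stack:

```python
import time

class SharedPrefixCache:
    """Toy shared prompt cache: a hit skips the simulated prefill cost.

    Hypothetical illustration only; real systems cache KV state per
    token prefix, but the timing asymmetry is the same idea.
    """
    def __init__(self):
        self.cached_prefixes = set()

    def generate(self, prompt):
        start = time.perf_counter()
        if prompt not in self.cached_prefixes:
            time.sleep(0.05)  # stand-in for expensive prefill on a miss
            self.cached_prefixes.add(prompt)
        latency = time.perf_counter() - start
        # The output itself is identical hit or miss: the cache stores
        # intermediate state, not anything that changes sampling.
        return "output", latency

cache = SharedPrefixCache()

# A victim tenant runs a secret-bearing prompt, populating the cache.
cache.generate("internal report: Q3 revenue was")

# An attacker times candidate prompts against the shared cache.
_, miss_latency = cache.generate("some unrelated probe text")
_, hit_latency = cache.generate("internal report: Q3 revenue was")

# A much faster response reveals the victim's prompt was cached.
print(hit_latency < miss_latency / 2)
```

So the reply above is right that the probability distribution is unchanged, but that alone doesn't rule out leakage: cache *membership* is observable via latency, which is why multi-tenant deployments typically scope caches per user or org.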