Amazing article. I was under the misapprehension that temp and other output parameters actually do affect caching. Turns out I was wrong and this explains why beautifully.
Great work. Learned a lot!
I had a “somebody is wrong on the internet!!” discussion about exactly this a few weeks ago, and the other person claimed to be a professor in AI.
Where do people get the idea that temperature affects caching in any way? Temperature is about next-token prediction, i.e. the output, not the input.
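To make that concrete, here is a rough sketch of where temperature enters the picture. It is a toy illustration, not any provider's actual implementation: `softmax_with_temperature`, `sample_next_token`, and `cache_key` are made-up names, and the cache key is assumed to be derived from the prompt tokens alone.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(logits, temperature, rng):
    """Pick a token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

def cache_key(prompt_tokens):
    """Hypothetical cache key: built from the input tokens only.

    Temperature is not an argument here, which is the whole point:
    it is applied to the logits after the prompt has been processed.
    """
    return tuple(prompt_tokens)

# The same logits (i.e. the same processed prompt) under two temperatures:
logits = [1.0, 2.0, 3.0]
sharp = softmax_with_temperature(logits, 0.5)  # lower temperature, peakier
flat = softmax_with_temperature(logits, 2.0)   # higher temperature, flatter
token = sample_next_token(logits, 0.5, random.Random(0))
```

The distribution changes with temperature, but the logits going in (and whatever was cached to produce them) do not.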
Yay, glad I could help! The sampling process is so interesting on its own that I really want to do a piece on it as well.