wesammikhail • yesterday at 12:07 PM • 0 replies • view on HN

Because in my mind, as a person not working directly on this kind of stuff, I figured that caching was done similarly to any resource caching in a webserver environment.

It's a semantics issue: the word "caching" is overloaded and means different things depending on context. For people who are not familiar with the inner workings of LLMs, this can cause understandable confusion.
