There’s a case for intelligent caching: coarse-grained TTLs like 1h and 5min are not optimal.

simianwords • today at 9:07 AM

Replies

Caching LLM context is not like caching normal content: the longer the cached context is, the more beneficial caching it is, and it only stops being worthwhile when the user ends the current session.

So you'd need some adaptive algorithm to decide when to keep caching and when to purge the cache entirely, possibly on the client side. But if you give the client control, people will make it use as much cache as possible just to chase diminishing returns. So fine-grained control here isn't all that easy; the other option is to have a cache size limit per account and then purge intelligently within it instead of relying on TTL alone.
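
To make the per-account idea concrete, here is a minimal sketch, not anything a provider is known to implement. Every name in it (`AccountCache`, `touch`, `budget_tokens`, `idle_scale_s`) is hypothetical, and the scoring rule is an assumption: an entry's value is its token count discounted by how long it has been idle, as a rough stand-in for "the session is probably over". Nothing expires on a timer; entries are only evicted when the account goes over its budget.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    tokens: int       # size of the cached context, in tokens
    last_used: float  # time of the most recent use, in seconds


class AccountCache:
    """Hypothetical per-account cache with a token budget instead of a TTL."""

    def __init__(self, budget_tokens: int, idle_scale_s: float = 300.0):
        self.budget_tokens = budget_tokens
        # Assumed decay constant for idle sessions; a guess, not a measured value.
        self.idle_scale_s = idle_scale_s
        self.entries: dict[str, Entry] = {}

    def _score(self, entry: Entry, now: float) -> float:
        # Bigger contexts save more work on a hit; idle ones are less likely to be hit.
        idle = max(0.0, now - entry.last_used)
        return entry.tokens * math.exp(-idle / self.idle_scale_s)

    def _total(self) -> int:
        return sum(e.tokens for e in self.entries.values())

    def touch(self, session_id: str, tokens: int, now: float) -> list[str]:
        """Record a use of a session's context and return the sessions evicted."""
        self.entries[session_id] = Entry(tokens, now)
        evicted: list[str] = []
        # Only purge under budget pressure, and never the session just used.
        while self._total() > self.budget_tokens:
            candidates = [s for s in self.entries if s != session_id]
            if not candidates:
                break
            victim = min(candidates, key=lambda s: self._score(self.entries[s], now))
            del self.entries[victim]
            evicted.append(victim)
        return evicted
```

With a budget of 10,000 tokens, a 6,000-token session that has been idle for ten minutes stays cached as long as there is room, and is the first to go once a new session pushes the account over budget. The client gets no knob to turn here, which sidesteps the "use as much cache as possible" problem at the cost of the provider having to pick the scoring rule.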
