Hacker News

kang, yesterday at 9:13 PM

> tokens written to cache all at once, which would eat up a significant % of your rate limits

Construction of context is not an LLM pass - it shouldn't even count towards token usage. The word 'caching' itself says "don't recompute me".

Since the devs on HN (& the whole world) are buying what looks like nonsense to me - what am I missing?