Hacker News

pradeep1177 today at 8:07 PM

So, how are you handling read/write caching when the next prompt keeps getting routed based on the task weights? For example, what if every 5th query goes to Opus, which then does an expensive cache write?


Replies

adchurch today at 9:34 PM

We consider the cost of missing the cache when making each routing decision after the initial one. Discussed in a bit more depth here: https://news.ycombinator.com/item?id=48689448
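
To make the idea concrete, here is a rough sketch of what cache-aware routing could look like. This is not the actual router discussed above: the `Model` fields, the prices, the quality scores, and the `cost_weight` trade-off are all made-up assumptions for illustration. The only point it shows is that a model with a cold cache pays a full cache write on the whole context, so that penalty enters the score for every routing decision after the first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    # All prices are hypothetical, in arbitrary units per input token.
    name: str
    cached_read_price: float   # token served from an existing cache entry
    cache_write_price: float   # token written to the cache


def prompt_cost(model: Model, prefix_tokens: int, new_tokens: int, cache_warm: bool) -> float:
    """Input-side cost of sending the conversation prefix plus the new turn."""
    if cache_warm:
        # Prefix is read from cache; only the new turn is written.
        return prefix_tokens * model.cached_read_price + new_tokens * model.cache_write_price
    # Cache miss: the whole context has to be written.
    return (prefix_tokens + new_tokens) * model.cache_write_price


def route(models, quality, warm, prefix_tokens, new_tokens, cost_weight):
    """Pick the model with the best (assumed) quality minus weighted cost."""
    def score(m: Model) -> float:
        cost = prompt_cost(m, prefix_tokens, new_tokens, m.name in warm)
        return quality[m.name] - cost_weight * cost
    return max(models, key=score)


# Hypothetical models and per-task quality estimates.
small = Model("small", cached_read_price=0.1, cache_write_price=1.25)
large = Model("large", cached_read_price=1.0, cache_write_price=12.5)
quality = {"small": 0.6, "large": 0.9}

# Same task, same weights; only the set of warm caches differs.
pick_when_small_warm = route([small, large], quality, {"small"}, 10_000, 100, 1e-5)
pick_when_large_warm = route([small, large], quality, {"large"}, 10_000, 100, 1e-5)
print(pick_when_small_warm.name, pick_when_large_warm.name)
```

With these invented numbers, the larger model only wins when its cache is already warm; sending an occasional query to it from a cold cache is penalized by the full write cost.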