thebeas • yesterday at 8:22 PM

We do both:

We compress tool outputs at each step, so the cache isn't broken during the run. Once usage reaches 85% of the context window, we preemptively trigger a summarization step, then load that summary when the context window fills up.
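For anyone who wants to picture it, here is a rough Python sketch of that flow. It is only an illustration under stated assumptions, not our actual code: `count_tokens`, `compress_tool_output`, the `Context` class, and the whitespace-based token count are all hypothetical stand-ins, and the truncation-based compression is a placeholder for whatever compression you prefer. The one design point it tries to show is that outputs are compressed before they are appended, so earlier entries are never rewritten afterwards.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for a real tokenizer.
    return len(text.split())


def compress_tool_output(output: str, max_tokens: int = 50) -> str:
    # Placeholder compression: keep the first max_tokens words.
    words = output.split()
    if len(words) <= max_tokens:
        return output
    return " ".join(words[:max_tokens]) + " [truncated]"


@dataclass
class Context:
    window: int                              # context-window size in tokens
    summarize: Callable[[List[str]], str]    # hypothetical summarizer callback
    threshold: float = 0.85                  # fraction that triggers the early summary
    messages: List[str] = field(default_factory=list)
    pending_summary: Optional[str] = None
    summarized_count: int = 0                # how many messages the summary covers

    def used(self) -> int:
        return sum(count_tokens(m) for m in self.messages)

    def _prepare_summary(self) -> None:
        self.pending_summary = self.summarize(list(self.messages))
        self.summarized_count = len(self.messages)

    def add_tool_output(self, output: str) -> None:
        # Compress before appending, so existing entries stay untouched.
        compressed = compress_tool_output(output)
        if self.used() + count_tokens(compressed) > self.window:
            # Window is full: swap in the summary prepared earlier,
            # keeping any messages that arrived after it was made.
            if self.pending_summary is None:
                self._prepare_summary()
            tail = self.messages[self.summarized_count:]
            self.messages = [self.pending_summary] + tail
            self.pending_summary = None
            self.summarized_count = 0
        self.messages.append(compressed)
        # Past the threshold: build the summary ahead of time.
        if self.pending_summary is None and self.used() >= self.threshold * self.window:
            self._prepare_summary()
```

In a real agent the summarizer would be a model call and could run in the background once the threshold is crossed; here it is a plain callback so the sketch stays runnable.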
