Can you say more about how you manage context efficiently (to reduce token costs)? Or are you filling the context to the brim with each request?
I think managing context is the most important aspect of today's coding agents. We pick only files we think would be relevant to the user request and add those. We generally pull more files than Cursor, which I think is an advantage.
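However, we also try to leverage prompt caching as much as possible to lower costs and improve latency.
That is why we basically only add files over time: prompt caching matches on the prompt prefix, so appending new files at the end leaves everything before them cacheable, while removing or reordering files would invalidate the cache. Once the context gets too large, the agent purges all the files and starts again.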
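To make the append-then-purge idea concrete, here is a minimal sketch of how it could work. It is not our actual implementation: the `AppendOnlyContext` class, the `estimate_tokens` helper, and the 4-characters-per-token heuristic are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    # Rough heuristic (an assumption for this sketch): ~4 characters per token.
    return max(1, len(text) // 4)


@dataclass
class AppendOnlyContext:
    """Hypothetical context buffer: files are only appended, never reordered."""

    max_tokens: int
    files: dict[str, str] = field(default_factory=dict)  # dicts keep insertion order
    purges: int = 0

    def total_tokens(self) -> int:
        return sum(estimate_tokens(body) for body in self.files.values())

    def add_file(self, path: str, body: str) -> None:
        if path in self.files:
            # Already in context: leave it untouched so the prefix stays stable.
            return
        if self.total_tokens() + estimate_tokens(body) > self.max_tokens:
            # Context would overflow: drop every file and start again.
            self.files.clear()
            self.purges += 1
        self.files[path] = body

    def render(self) -> str:
        # Files are rendered in insertion order, so each new prompt
        # begins with the previous prompt as an exact prefix.
        return "\n".join(f"### {path}\n{body}" for path, body in self.files.items())
```

As long as no purge happens, each rendered prompt starts with the previous one, which is the property a prefix-based prompt cache needs. A purge gives up the cached prefix once, in exchange for a small context again.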