A developer can blast through millions of tokens in minutes. With a context size of 250k, that’s just 4 full-context queries per million. And with tool usage and subsequent calls etc., a single request can easily use many millions.
But if you just ask a question or something it’ll take a while to spend a million tokens…
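Rough arithmetic for the numbers above, as a sketch only. It assumes every call resends the full 250k context with no caching or trimming, and the 20-call count is a made-up illustration, not a measurement; `tokens_sent` is a hypothetical helper.

```python
def tokens_sent(context_tokens: int, calls: int) -> int:
    """Total input tokens if every call resends the same full context (assumption)."""
    return context_tokens * calls


CONTEXT = 250_000  # context size mentioned in the comment above

# Four plain full-context queries already add up to a million tokens.
print(tokens_sent(CONTEXT, 4))   # 1000000

# One request that fans out into 20 tool-call round trips (hypothetical count).
print(tokens_sent(CONTEXT, 20))  # 5000000
```

A short question with a small context sits at the other end: at a few thousand tokens per call it takes hundreds of calls to reach a million.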
Seems like an opportunity to condense the context down to a 'documentation' level and only load the full text/code for files that are expected to be edited?
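A minimal sketch of that idea, assuming the files are Python and that "documentation level" means top-level names plus the first line of each docstring. `summarize` and `build_context` are hypothetical helpers for illustration, not any existing tool's API.

```python
import ast


def summarize(source: str) -> str:
    """Reduce Python source to an outline: names plus first docstring line."""
    tree = ast.parse(source)
    lines = []
    module_doc = ast.get_docstring(tree)
    if module_doc:
        lines.append(module_doc.splitlines()[0])
    # Only top-level definitions are listed; bodies are dropped entirely.
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "def"
            doc = ast.get_docstring(node)
            first = doc.splitlines()[0] if doc else "(undocumented)"
            lines.append(f"{kind} {node.name}: {first}")
    return "\n".join(lines)


def build_context(files: dict[str, str], to_edit: set[str]) -> str:
    """Include full source only for files expected to be edited; outline the rest."""
    parts = []
    for path, source in files.items():
        if path in to_edit:
            parts.append(f"### {path} (full)\n{source}")
        else:
            parts.append(f"### {path} (summary)\n{summarize(source)}")
    return "\n\n".join(parts)
```

Usage would be something like `build_context(files, to_edit={"b.py"})`, where `files` maps paths to source text. The hard part this sketch skips is deciding up front which files will need editing.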