Really depends on the repo you’re working in.
If it’s very large, especially if the tool needs to refer to documentation for a lot of custom frameworks and APIs, you often end up needing very large context windows that burn through tokens faster.
If it’s smaller or sticks with common frameworks that the model was trained on, it’s able to do a lot more with smaller context windows and token usage is way lower.
I'm currently in repos where the context window required is so large that the output is almost always "wrong" for the problem at hand. Quite a few people at my company burn through tokens this way, and it certainly isn't providing value to the company.
Begs the question if we should move on to minimal microservices so that whole project lives in context of llm. I hardly have to do anything when I'm working with small project with llm.
Yes, in a reasonable microservice land where the places you need to connect to are all documented in very concise places, you have have extremely productive $10 days. In the giant monorepo with everything custom, you can't just rely on built in knowledge of 80% of you libraries, so it's a very different world.
A place like Google has to be so much better off just training library concepts in, given how much of the things the LLM will "instinctively" reach for are unlikely to be available. Not unlike the acclimation period what happens when someone comes in or out of a company like that, and suddenly every library and infra tool you were used to are just not available. We need a lot more searching when that happens to us, and the LLM suffers from the same context issue. The human just has all of that trained in after a 6 months, but the LLM doesn't.
On larger repos it spends a lot of time just finding the one line of code that needs to change. (I have the same problem, as a human!)
Will this result in people moving away from large monorepos to per-unit, quasi-micro repositories to save in token use?
So if the AI could do the same work on huge codebases with far fewer tokens, would it be good or bad for the AI companies do you think?
The codebase and the topic you're working on are huge variables.
I don't use LLMs to write code (other than simple refactors and throwaway stuff) but I do use them heavily to crawl through big codebases and identify which files and functions I need to understand.
Some of the codebases I explore will burn through tokens at a rapid rate because there is so much complex code to get through. If I use the $20 Claude plan and Opus I can go through my entire 5-hour allocation in a single prompt exploring the codebase some times, and it's justified.
Other times I'm working on simple topics, even in a large codebase, and it will sip tokens because it only needs to walk a couple files to get to what it needs to answer my questions.