> Once the codebase has become fully agentic, i.e., only agents fundamentally understand it
What exactly do we mean by this? Because it is obviously common for human coders to tackle learning how an unfamiliar and complex codebase works so that they can modify it (new hires do it all the time). I think this can mean one of two things:
* The code and architecture being produced by agents takes approaches that are abnormally complex or inscrutable to human reviewers. Is that what folks working with cutting-edge agents are seeing? In which case, such code obviously isn’t being reviewed; it can’t be.
* The code and architecture being produced by agents can still be understood by human reviewers, but it isn’t actually being reviewed by anyone — since reviewing pull requests isn’t always fun or easy, and injecting in-depth human review slows everything down a lot — and so no one understands how the code works. (I keep thinking about the AI maximalist who recently said he woke up to 75 pull requests from his agent, like that was a good thing.)
And maybe it’s a combination of the two: agent-generated pull requests are incrementally harder to grok, which makes reviewing more painful and take longer, which means more of them go without in-depth reviews.
But if your claim is true, the bottom line is that it means no one is fully reviewing code produced by agents.
> What exactly do we mean by this? Because it is obviously common for human coders to tackle learning how an unfamiliar and complex codebase works so that they can modify it (new hires do it all the time).
I agree with you, BUT: I find it much harder to get my head around a medium-sized vibe-coded project than a medium-sized bespoke-coded project. It's not even close.
I don't know what codebases will look like if/when they become "fully agentic". Right now, LLM agents get worse, not better, as a codebase grows, and as more of it is coded (or, worse, architected) by LLMs.
Humans get better over time in a project and LLMs get worse, and this seems fundamental to the LLM architecture. The only real way I see for codebases to become fully agentic right now is if they're small enough. That size ceiling grows as the context sizes new models can handle grow.
If that's how this plays out - context windows get large enough that LLM agents can work fine in perpetuity in medium or large projects - I wonder if the resulting projects will be extremely difficult for humans to wrap their heads around. That is, if the LLM relies on looking at massive chunks of the codebase all at once, we could get to the point of fully agentic codebases without ever having to tackle the problem of LLMs being terrible at architecture, because they won't need it.
For your points:
- Garden path approaches are definitely a thing, but I don't think this is necessarily catastrophic. A lot depends on the language and framework in question, and also the driver of the change.
- I think it's that, plus the fact that it's easy to just generate ever more code. Solutions scale in every dimension until they hit a limit where it's not feasible to go further. If AI tools will let you write a project with a million or 10 million lines of code, you can bet it will eventually happen. Who's ever gonna fix that?
Folks are reviewing the code, but the standard shape of a review is a PR. That diff assumes you have an underlying knowledge of the system, one that is most realistically gained by having written the code. Could you “just remember” every diff you’ve seen? Maybe, but I don’t think it’s realistic; we learn far better from doing than from reading.