One thing I appreciated about this post, unlike a lot of AI-skeptic posts, is that it actually makes a concrete falsifiable prediction; specifically, "LLMs will never manage to deal with large code bases 'autonomously'". So in the future we can look back and see whether it was right.
For my part, I'd give 80% confidence that LLMs will be able to do this within two years, without fundamental architectural changes.
"Deal with" and "autonomously" are doing a lot of heavy lifting there. Cursor already does a pretty good job indexing all the files in a code base in a way that lets it ask questions and get answers pretty quickly. It's just a matter of where you set the goalposts.
"Autonomously"? What happens when subtle updates that aren't bugs, but that change the meaning of some feature, break the workflow in some other external part of a client's system? It happens all the time, and because it's really hard to keep the full meaning and business rules written down and up to date, an LLM might never be able to grasp some of that meaning. Maybe if, instead of developing code and infrastructure, the whole industry shifted toward writing impossibly precise spec sheets that make meaning and intent crystal clear, then "autonomously" might be possible to pull off.
I don't think that statement is falsifiable until you define "deal with" and "large code bases."
How large? What does "deal" mean here? Autonomously - is that on its own whim, or at the behest of a user?
>LLMs will never manage to deal
time to prove hypothesis: infinity years
That feels like a statement that's far too loosely defined to be meaningful to me.
I work on codebases you could describe as 'large', and you could describe some of the LLM-driven work being done on them as 'autonomous' today.