For a senior engineer, some very odd takes here:
"Our ability to zoom in and implement code is now obsolete Even with SOTA LLMs like Opus 4.5 this is downright untrue. Many, many logical, strategic, architectural, and low level code mistakes are still happening. And given context window limitations of LLMs (even with hacks like subagents to work around this) big picture long-term thinking about code design, structure, extensibility, etc. is very tricky to do right."
If you can't see this, I have to seriously question your competence as an engineer in the first place tbh.
"We already do this today with human-written code. I review some code very closely, and other code less-so. Sometimes I rely on a combination of tests, familiarity of a well-known author, and a quick glance at the code to before saying "sure, seems fine" and pressing the green button. I might also ask 'Have you thought of X' and see what they say.
Trusting code without reading all of it isn't new, we're just now in a state where we need to review 10x more code, and so we need to get much better at establishing confidence that something works without paying human attention all the time.
We can augment our ability to write code with AI. We can augment our ability to review code with AI too."
Later he goes onto suggest that confidence is built via TDD. Problem is... if the AI is generating both code and tests, I've seen time and time again both in internal projects and OSS projects how major assumptions are incorrect, mistakes compound, etc.
> Our ability to zoom in and implement code is now obsolete Even with SOTA LLMs like Opus 4.5 this is downright untrue. Many, many logical, strategic, architectural, and low level code mistakes are still happening. And given context window limitations of LLMs (even with hacks like subagents to work around this) big picture long-term thinking about code design, structure, extensibility, etc. is very tricky to do right.
> If you can't see this, I have to seriously question your competence as an engineer in the first place tbh.
I can't agree more strongly. I work with a number of folks who say concerning things along the lines of what you describe above (or just slightly less strong). The trust in a system that is not fully trustworthy is really shocking, but it only seems to come from a particular kind of person. It's hard to describe, but I'd describe it as: people that are less concerned with the contents of the code versus the behaviour of the program. It's a strange dichotomy, and surprising every time.
I mean, if you don't get the economics of a reasonably factored codebase vs one that's full of hacks and architecturally terrible compromises - you're in for a VERY bad time. Perhaps even a company-ending bad time. I've seen that happen in the old days, and I expect we're in the midst of seeing a giant wave of failures due to unsustainably maintained codebases. But we probably won't be able to tell, startups have been mostly failing the entire time.
Its sad we have gone from making the software as good and efficient as possible to "good enough, ship it". This is the main reason AI agents are successful.
LLMs also try and find short cuts to get the task done, for example I wrote some code (typescript) for work that had a lot of lint errors (I created a pretty strict rule set)
And I asked codex to fix them for me, first attempt was to add comments to disable the rules for the whole file and just mark everything as any.
Second attempt was to disable the rules in the eslint config.
It does the same with tests it will happily create a work around to avoid the issue rather than fix the issue.