I think AI is just a massive force multiplier. If your codebase has a bad foundation and is going in the wrong direction with lots of hacks, it will just write code which mirrors the existing style... And you get exactly what OP is suggesting.
If, however, your code foundations are good, highly consistent, and never allow hacks, then the AI will maintain that clean style and it becomes shockingly good; in this case, the prompting barely even matters. The code foundation is everything.
But I understand why a lot of people are still having a poor experience. Most codebases are bad. They work (within very rigid constraints, in very specific environments), but they're unmaintainable and very difficult to extend; they require hacks on top of hacks. Each new feature essentially requires a minor or major refactoring, with more and more scattered code changes because everything is interdependent (tight coupling, low cohesion). Productivity grinds to a crawl and you need 100 engineers to do what previously could have been done with just 1. This is not a new effect. It's just much more obvious now with AI.
I've been saying this for years, but I think too few engineers have actually built complex projects on their own to understand this effect. There's a parallel with building architecture: you are constrained by the foundation of the building. If you designed the foundation for a regular single-storey house, you can't change your mind halfway through construction and build a 20-storey skyscraper. That said, if your foundation is good enough to support a 100-storey skyscraper, then you can build almost anything you want on top.
My perspective is that if you want to empower people to vibe code, you need to give them really strong foundations to work on top of. There will still be limitations, but they'll be able to go much further.
My experience is: the more planning and intelligence goes into the foundation, the less intelligence and planning is required for the actual construction.
The wrinkle is that the AI doesn't have a truly global view, and so it slowly degrades even good structure, especially if run without human feedback and review. But you're right that good structure really helps.
This is what I’ve discovered as well. I’ve been working on refactoring a massive hunk of really poor quality contractor code, and Codex originally made poor and very local fixes/changes.
After rearchitecting the foundations (dumping Bootstrap, building easy-to-use form fields, fixing hardcoded role references 1, 2, 3…, consolidating TypeScript types, etc.), it makes much better choices without needing specific guidance.
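To give a flavor of what the role-reference cleanup looks like, here is a minimal sketch of that kind of change; the Role enum, User interface, and hasRole helper are hypothetical names for illustration, not the actual code.

```typescript
// Before: role checks were scattered around as magic numbers,
// e.g. `if (user.roleId === 2) { ... }` with no hint of what 2 means.

// After: a single source of truth that both humans and the agent can follow.
export enum Role {
  Admin = 1,
  Editor = 2,
  Viewer = 3,
}

export interface User {
  id: string;
  role: Role;
}

// Central helper so new code never compares against raw numbers again.
export function hasRole(user: User, role: Role): boolean {
  return user.role === role;
}

// Usage: hasRole(user, Role.Editor) instead of user.roleId === 2.
```

Once that kind of structure is in place, the agent tends to reach for Role.Editor on its own instead of inventing another magic number.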
Codex/Claude Code won’t solve all your problems, though. You really need to take some time to understand the codebase and fix the core abstractions before you set it loose. Otherwise, it just stacks garbage on garbage, gets stuck patching, and won’t actually fix the core issues unless instructed.
A tangent: I keep hearing about this good base, but I've never seen one, not in the real world.
No project will have this mythical base, unless it's only you working on it, you are your only client, and its scope is so rigid it's frankly useless. Over time the needs change; there's no sticking to the plan. Often it's a change that requires rethinking a major part. What we loathe as tight coupling was just efficient code under the original requirements. Then it becomes a comparison of time/opportunity cost vs quality loss. Time and opportunity always win. Why?
Because we live in a world run by humans, who are messy and never stick to the plan. Our real-world systems (bureaucracy, government processes, the list goes on) are never fully automated and always leave gaps for humans to intervene. There's always a special case, an exception.
Perfectly architected code vs code that just does the thing makes no real-world difference. Long-term maintainability? Your code doesn't run in a vacuum; it depends on other things, and its output is depended on by other things. Change is real, entropy is real. Even you, the perfect programmer who writes perfect code, will succumb eventually and think back on all this with regret. Because you too had to choose between time/opportunity and your ideals, and you chose wrong.
Thanks for reading my blog-in-hn comment.
And what if the foundation was made by the AI itself? What’s the excuse then?
Can the AI help with refactoring a poor codebase? Can it at least provide good suggestions for improvement if asked to broadly survey a design that happens to be substandard? Most codebases are quite bad as you say, so this is a rather critical area.
When you say multiplier, what kind of number are you talking about? Like, what multiple of features shipped that don't require immediate fixes have you experienced?
My exact experience, and AI is especially fragile when you are starting a new project from scratch.
Right now I'm building an NNTP client for macOS (with AppKit), because why not, and initially I had to very carefully plan and prompt what the AI has to do, otherwise it would go insane (integration tests are a must).
Right now I have the read-only mode ready and it's very easy to build stuff on top of it.
Also, I had to provide a lot of SKILLS to GPT5.3
How do you know there is such a thing as good code foundations, and how do you know you have them? This is an argument from ego.
socketcluster nailed it. I've seen this firsthand — the same agent produces clean output when the codebase has typed specs and a manifest, and produces garbage when it's navigating tribal knowledge. The hard part was always there. Agents just can't hide it like humans can.
I agree completely.
I just did my first “AI-native coding project”, both because for now I haven’t run into any quotas using Codex CLI with my $20/month ChatGPT subscription, and because the company just gave everyone an $800/month Claude allowance.
Before I even started the implementation, I gathered:
1. The initial sales contract with the business requirements
2. The notes I got from talking to sales
3. The transcript of the initial discovery calls
4. My design diagrams, well labeled (cloud architecture and what each lambda does)
5. The transcript of the design review, including my explanations and answers to questions
6. My ChatGPT-assisted breakdown of the Epics/stories and tasks I had to do for the PMO
I then told ChatGPT, during that session, to give a detailed breakdown of everything as Markdown.
That was the start of my AGENTS.md file.
While working through everything task by task and having Codex/Claude Code do the coding, I told it to update a separate md file with what it did, when I told it to do something differently, and why.
Any developer coming in after me will have complete context of the project from the first git init and they and the agents will know the why behind every decision that was made.
Can you say that about any project that was done before GenAI?