I think AI is just a massive force multiplier. If your codebase has a bad foundation and is going in the wrong direction with lots of hacks, it will just write code which mirrors the existing style... And you get exactly what OP is suggesting.
If, however, your code foundations are good and highly consistent and you never allow hacks, then the AI will maintain that clean style and it becomes shockingly good; in this case, the prompting barely even matters. The code foundation is everything.
But I understand why a lot of people are still having a poor experience. Most codebases are bad. They work (within very rigid constraints, in very specific environments) but they're unmaintainable and very difficult to extend, requiring hacks on top of hacks. Each new feature essentially requires a minor or major refactoring, with more and more scattered code changes as everything is interdependent (tight coupling, low cohesion). Productivity just grinds to a slow crawl and you need 100 engineers to do what previously could have been done with just 1. This is not a new effect. It's just much more obvious now with AI.
I've been saying this for years, but I think too few engineers have actually built complex projects on their own to understand this effect. There's a parallel with building architecture: you are constrained by the foundation of the building. If you designed the foundation for a regular single-storey house, you can't change your mind halfway through the construction process and build a 20-storey skyscraper. That said, if your foundation is good enough to support a 100-storey skyscraper, then you can build almost anything you want on top.
My perspective is if you want to empower people to vibe code, you need to give them really strong foundations to work on top of. There will still be limitations but they'll be able to go much further.
My experience is: the more planning and intelligence goes into the foundation, the less intelligence and planning is required for the actual construction.
I think it makes the annoying part less annoying?
Also re: "I spent longer arguing with the agent and recovering the file than I would have spent writing the test myself."
In my humble experience arguing with an LLM is a waste of time, and no-one should be spending time recovering files. Just do small changes one at a time, commit when you get something working, and if a change doesn't work out, discard it and try again.
I don't think AI is a panacea; it's about knowing when it's the right tool for the job and when it isn't.
> Reading and understanding other people's code is much harder than writing code.
I keep seeing this sentiment repeated in discussions around LLM coding, and I'm baffled by it.
For the kind of function that takes me a morning to research and write, it takes me probably 10 or 15 minutes to read and review. It's obviously easier to verify that something is correct than to come up with the correct thing in the first place.
And obviously, if it took longer to read code than to write it, teams would be spending the majority of their time in code review, but they don't.
So where is this idea coming from?
My feeling is that people are using AI in the wrong way.
Current LLMs are best used to generate the string of text that's most statistically likely to form a coherent sentence, so from the user's perspective they're most useful as an alternative to a manual search engine, letting the user find quick answers to a simple question, such as "how much baking soda is needed for X units of Y bread", or "how to print 'Hello World' 10 times in a loop in X programming language". Beyond this use case, the results can be unreliable, and that is to be expected.
Sure, it can also generate long code and even an entire fine-looking project, but it does so by following a statistical template, that's it.
That's why "the easy part" is easy because the easy problem you try to solve is likely already been solved by someone else on GitHub, so the template is already there. But the hard, domain-specific problem, is less likely to have a publicly-available solution.
I don't think it makes any part harder. What it does do is expose what people have ignored their whole career: the hard part. The last 15 years of software development have been 'human vibe coding': copy+pasting snippets from SO without understanding them, no planning, constant rearchitecting, shipping code to prod as long as it runs on your laptop. Now that the AI is doing it, suddenly people want to plan their work and enforce tests? Seems like a win-win to me. Even if it slows down development, that would be a win, because the result is better quality being enforced.
The "marathon of sprints" paradigm is now everywhere and AI is turning it to 120%. I am not sure how many devs can keep sprinting all the time without any rest. AI maybe can help but it tends to go off-rails quickly when not supervised and reading code one did not author is more exhausting than just fixing one's own code.
> On a personal project, I asked an AI agent to add a test to a specific file. The file was 500 lines before the request and 100 lines after. I asked why it deleted all the other content. It said it didn't. Then it said the file didn't exist before. I showed it the git history and it apologised, said it should have checked whether the file existed first.
Ha! Yesterday an agent deleted the plan file after I told it to "forget about it" (as in, leave it alone).
I think this is the wrong mental model. The correct one is:
'AI makes everything easier, but it's a skill in itself, and learning that skill is just as hard as learning any other skill.'
For a more complete understanding, you also have to add: 'we're in the ENIAC era of AI. The equivalents of high-level languages and operating systems haven't yet been invented.'
I have no doubt the next few years will birth a "context engineering" academic field, and everything we're doing currently will seem hopelessly primitive.
My mind changed on this after attempting complex projects—with the right structure, the capabilities appear unbounded in practice.
But, of course, there is baked-in mean reversion. Doing the most popular and uncomplicated things is obviously easier. That's just the nature of these models.
Every time somebody writes an article like this without any dates and without saying which model they used, my guess is that they've simply failed to internalize the idea that "AI" is a moving target, and haven't understood that they saw a capability level from a fleeting moment in time, rather than an Eternal Verity about the Forever Limits of AI.
My experience has been that if you fully embrace vibe coding...you can get some neat stuff accomplished, but the technical debt you accumulate is of such magnitude that you're basically a slave to the machine.
Once the project crosses a couple of thousand lines of code, none of which you've written yourself, it becomes difficult to actually keep up with what's happening. Even reviewing can become challenging since you get it all at once, and the LLM-esque coding style can at times be bloated and obnoxious.
I think in the end, with how things are right now, we're going to see the rise of disposable code and software. The models can churn out apps / software which will solve your specific problem, but that's about it. Probably a big risk to all the one-trick pony SaaS companies out there.
> The hard part is investigation, understanding context, validating assumptions, and knowing why a particular approach is the right one for this situation
Yes. Another way to describe it is the valuable part.
AI tools are great at delineating high and low value work.
I think the author answers their own question at the end.
The first 3/4 of the article is "we must be responsible for every line of code in the application, so having the LLM write it is not helping".
The last 1/4 is "we had an urgent problem so we got the LLM to look at the code base and find the solution".
The situation we're moving to is that the LLM owns the code. We don't look at the code. We tell the LLM what is needed, and it writes the code. If there's a bug, we tell the LLM what the bug is, and the LLM fixes it. We're not responsible for every line of code in the application.
It's exactly the same as with a compiler. We don't look at the machine code that the compiler produces. We tell the compiler what we want, using a higher-level abstraction, and the compiler turns that into machine code. We trust compilers to do this error-free, because 50+ years of practice has proven to us that they do.
We're maybe ~1 year into coding agents. It's not surprising that we don't trust LLMs yet. But we will.
And it's going to be fascinating how this changes computer science. We have interpreted languages because compilers got so good. Presumably we'll get to non-human-readable languages that only LLMs can use. And methods of defining systems to an LLM that are better than plain English.
I'm working on a paper connecting articulatory phonology to soliton physics. Speech gestures survive coarticulatory overlap the same way solitons survive collision. The nonlinear dynamics already in the phonetics literature are structurally identical to soliton equations. Nobody noticed because these fields don't share conferences.
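(For readers who haven't met a soliton equation: the canonical example is Korteweg-de Vries,

    u_t + 6 u u_x + u_xxx = 0

whose localized solutions collide and come out the other side with their shape and speed intact. I'm not claiming KdV itself is the equation in the phonetics literature; it's just the standard reference point for that kind of robustness under interaction.)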
The article's easy/hard distinction is right but the ceiling for "hard" is too low. The actually hard thing AI enables isn't better timezone bug investigation LOL! It's working across disciplinary boundaries no single human can straddle.
The OP's example of AI writing 500 LOC, then deleting 400, and saying it didn't… Last time I saw something like that was at least a year ago, or maybe from some weaker models. It seems to me the problem with articles like this is that while they're sometimes true at the moment, they're usually invalidated within weeks.
People need to consider/realize that the vast majority of source code training data is GitHub, GitLab, and essentially the huge sea of started, maybe completed, student and open source projects. That large body of source code is for the most part unused, untested, and unsuccessful software of unknown quality. That source code is AI's majority training data, and an AI model in training has no idea what is quality software and what is "bad" software. That means the average source code generated by AI is not necessarily good software. Considering it is an average of algorithms, it's surprising generated code runs at all. But then again, generating compiling code is actually trainable, so what is generated can receive extra training support. However, that does not improve the quality of the source code training data, just the fact that it will compile.
Yep, it's why getting the work over the threshold takes just as long as it did without AI.
Someone mentioned it is a force multiplier, and I don't disagree with this: it is a force multiplier in the mundane and ordinary execution of tasks. Complex ones get harder and harder for it, because humans can visualize the final result where AI can't. It is predicting from input, but it can't know the destination output if the destination isn't part of the input.
Diagnosing difficult bugs has often been considered the "hard part" and coding agents seem quite good at it?
So I'm not sure this is a good rule of thumb. AI is better at doing some things than others, but the boundary is not that simple.
> makes the easy part easier and the hard part harder
That is to say, just like every headline-grabbing programming "innovation" of the last thirty years.
Totally agree on AI-assisted coding resulting in randomly changed code. Sometimes it's subtle and other times entire methods are removed. I have moved back to just using a JetBrains IDE and copying files into Gemini so that I can limit context. Then I use the IDE to inspect changes in a git diff, regression test everything, and after all that, commit.
Daily agentic user here, and to me the problem here is the very notion of "vibe coding". If you're even thinking in those terms - this idea that never looking at the code has become a goal unto itself - then IMO you're doing LLM-assisted development wrong.
This is very much a hot take, but I believe that Claude Code and its yolo peers are an expensive party trick that gives people who aren't deep into this stuff an artificially negative impression of tools that can absolutely be used in a responsible, hugely productive way.
Seriously, every time I hear anecdotes about CC doing the sorts of things the author describes, I wonder why the hell anyone is expecting more than quick prototypes from an LLM running in a loop with no intervention from an experienced human developer.
Vibe coding is riding your bike really fast with your hands off the handles. It's sort of fun and feels a bit rebellious. But nobody who is really good at cycling is talking about how they've fully transitioned to riding without touching the handles, because that would be completely stupid.
We should feel the same way about vibe coding.
Meanwhile, if you load up Cursor and break your application development into bite sized chunks, and then work through those chunks in a sane order using as many Plan -> Agent -> Debug conversations with Opus 4.5 (Thinking) as needed, you too will obtain the mythical productivity multipliers you keep accusing us of hallucinating.
If coding was always the “easy part,” what was the point of leetcode grinding for interview preparation?
I swear most of the comments on posts like these are no more original than an LLM, and often less so.
> My friend's panel raised a point I keep coming back to: if we sprint to deliver something, the expectation becomes to keep sprinting. Always. Tired engineers miss edge cases, skip tests, ship bugs. More incidents, more pressure, more sprinting. It feeds itself.
Sorry but this is the whole point of software engineering in a company. The aim is to deliver value to customers at a consistent pace.
If a team cannot manage their own burnout or expectations with their stakeholders then this is a weak team.
It has nothing to do with using AI to make you go faster. AI does not cause this at all.
The pattern matching and absence of real thinking are still strong.
Tried to move some Excel generation logic from the EPPlus library to ClosedXML.
ClosedXML has basically the same API, so the conversion was successful. Not a one-shot, but relatively easy with a few manual edits.
But ClosedXML has no real batch operations (like applying a style to an entire column): the API is there, but the internal implementation works cell by cell. So if you have 10k rows and 50 columns, every style update is a slow operation.
Naturally, told codex 5.3 (max thinking level) all about this. The fucker still succumbed to range updates here and there.
Told it explicitly to make a style cache and reuse styles on cells along the same y axis.
5-6 attempts — fucker still tried ranges here and there. Because that is what is usually done.
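What I wanted is roughly this pattern (sketched in Python with openpyxl since that's easier to paste here than my actual C#; the ClosedXML version caches the style object per column in exactly the same way, and the bold-first-column rule is made up purely for illustration):

    from openpyxl import Workbook
    from openpyxl.styles import Font, NamedStyle

    wb = Workbook()
    ws = wb.active

    # One style per column, created once and reused for every cell in that
    # column, instead of a fresh style (or a range-wide update) per write.
    _style_cache = {}

    def column_style(col):
        if col not in _style_cache:
            style = NamedStyle(name=f"col_{col}", font=Font(bold=(col == 1)))
            wb.add_named_style(style)
            _style_cache[col] = style.name
        return _style_cache[col]

    for row in range(1, 10_001):        # 10k rows
        for col in range(1, 51):        # 50 columns
            cell = ws.cell(row=row, column=col, value=row * col)
            cell.style = column_style(col)   # reuse the cached style, no range ops

    wb.save("out.xlsx")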
Not here yet. Maybe in a year. Maybe never.
If the "hard part" is writing a detailed spec for the code you're about to commit to the project, AI can actually help you with that if you tell it to. You just can't skip that part of the work altogether and cede all control to a runaway slop generator.
The truth is that it’s lowering the difficulty of work people used to consider hard. Which parts get easier depends on the role, but the change is already here.
A lot of people are lying to themselves. Programming is in the middle of a structural shift, and anyone whose job is to write software is exposed to it. If your self-worth is tied to being good at this, the instinct to minimize what’s happening is understandable. It’s still denial.
The systems improve month to month. That’s observable. Most of the skepticism I see comes from shallow exposure, old models, or secondhand opinions. If your mental model is based on where things were a year ago, you’re arguing with a version that no longer exists.
This isn’t a hype wave. I’m a software engineer. I care about rigor, about taste, about the things engineers like to believe distinguish serious work. I don’t gain from this shift. If anything, it erodes the value of skills I spent years building. That doesn’t change the outcome.
The evidence isn’t online chatter. It’s sitting down and doing the work. Entire applications can be produced this way now. The role changes whether people are ready to admit it or not. Debating the reality of it at this point mostly signals distance from the practice itself.
Some time back, my manager at the time, who shall remain nameless, told the group that having AI is like having 10 people work for you (he actually used a slightly smaller number, but it was said almost word for word like in the article), with the expectation being set as: 'you should now be able to do 10x as much'.
Needless to say, he was wrong and was gently corrected over the course of time. In his defense, his use cases for LLMs at the time were summarizing emails in his email client... so, eh, not exactly much to draw realistic experience from.
I hate to say it, but maybe the Nvidia CEO is actually right for once. We have a 'new smart' coming to our world: the type of person who can move between the worlds of coding, management, projects and CEOing with relative ease and translate between those worlds.
It seems like a big part of the divide is that people who learned software engineering find vibe coding unsuitable for any project intended to stay in use for long, while those who only learned coding think vibe coding is the next big thing, because they never have to deal with the consequences of the bad code.
AI is at its best when it makes the boring verbose parts easier.
Training is the process of regressing to the mean with respect to the given data. It's no surprise that it wears away sharp corners and inappropriately fills recesses of collective knowledge in the act of its reproduction.
Don't let AI write code for you unless it's something trivial. Instead, use it to plan things: high-level stuff, discussing architecture, asking it to explain concepts. Use it as a research tool. It's great at that. It's bad at writing code when it needs to be performant or needs to span multiple files. Especially when it spans multiple files, because that's where it starts hallucinating and introducing abstractions and boilerplate that aren't necessary, which just makes your life harder when it comes to debugging.
Imagine if every function you see starts checking for null params. You ask yourself "when can this be null?", right? It complicates your mental model of the data flow to the point that you lose track of what's actually real in your system. And once you lose track of that, it is impossible to reason about your system.
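A tiny sketch of the pattern (hypothetical names, Python just for illustration):

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Line:
        price: float

    @dataclass
    class Order:
        lines: List[Line] = field(default_factory=list)

    # Every layer re-checks the same argument, so the reader can no longer
    # tell where None is actually possible.
    def order_total(order: Optional[Order]) -> float:
        if order is None:            # can it really be None here?
            return 0.0
        return sum(line.price for line in order.lines)

    def format_invoice(order: Optional[Order]) -> str:
        if order is None:            # ...or here?
            return ""
        return f"Total: {order_total(order)}"

    print(format_invoice(Order([Line(9.99), Line(5.00)])))   # Total: 14.99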
For me AI has replaced searching on stack overflow, google and the 50+ github tabs in my browser. And it's able to answer questions about why some things don't work in the context of my code. Massive win! I am moving much faster because I no longer have to switch context between a browser and my code.
My personal belief is that the people who can harness the power of AI to synthesize loads of information, while keeping their engineering skills polished, will be the ones who land on their feet after this storm is over. At the end of the day AI is just another tool for us engineers to improve our productivity, and if you think about what being an engineer looked like before AI even existed, more than 50% of our time was spent sifting through Google search results, Stack Overflow, GitHub issues and other people's code. That's now gone and lives in your IDE, in natural language, with code snippets adapted to your specific needs.
As usual, the last 20% needs 80% of the effort and the other 80% needs 20%, but my god did AI make my bs corpo easy repeatable shit work (skimming docs, writing summaries, skimming Jira and Confluence, and so on) actually easier. And for 90% of bs CRUD app changes the first draft is also already pretty good. Tbh I don't write hard/difficult code more than once a week/month.
It's pretty difficult to say what it's going to be like three months from now. A few months ago Gemini 2.x in IDEA and related IDEs had to be dragged through coding tasks and would create dumb build-time errors on its way to making buggy code.
Gemini in Antigravity today is pretty interesting, to the point where it's worth experimenting with vague prompts just to see what it comes up with.
Coding agents are not going to just change coding. They make a lot of detailed product management work obsolete, and smaller team sizes will make it imperative to reread the agile manifesto and discard scrum dogma.
Coding with AI assistants is just a completely different skill, one that shouldn't be measured by comparing it to the way human programmers write code. Almost everything we have (programming languages, frameworks, principles of software development in teams, agile, clean code, TDD, DRY, and other debatable or well-accepted practices) exists to overcome the limitations of the human mind. AI does not have those limitations, and it has others.
What I've found useful for complex tasks is to use it as a tool to explore the high-dimensional space that lies behind the task being solved. It can rarely be described as giving a prompt and coming back for a result. For me it's usually about having winding conversations, writing lists of invariants and partial designs, and feeding them back in a loop. Hallucinations and mistakes become a signal that shows whether my understanding of the problem fits or not.
I vibe coded a retro emulator and assembler with tests. Prompts were minimal and I got really great results (Gemini 3). I tried vibe coding the tricky proprietary part of an app I worked on a few years ago: highly technical domain (yes, vague; I don't care to dox myself). Lots of prompting and it didn't get close.
There are literally thousands of retro emulators on GitHub. What I was trying to do had zero examples on GitHub. My takeaway is obvious as of now: some stuff is easy, some not at all.