Using AI to write better code more slowly

307 points • by signa11 • yesterday at 11:16 PM • 122 comments

Comments

I've hit this point with AI where it's not a simple process, but a long, drawn-out back-and-forth.

I'll use AI to design the implementation of a medium-sized, cross-cutting feature. Review all the details, maybe iterate on just that. Then implement with Claude 4.7 Max - which runs slower, but does a better job. Then review the implementation, then have Codex GPT 5.5 xhigh fast review it - which almost always finds corner cases. Have Claude fix those - Claude is better at writing intuitive, maintainable code versus Codex's overengineered, shortcut-filled code. (Codex is better at finding/fixing bugs and doing reviews - it's annoyingly pedantic.)

Then repeat with fresh Claude/Codex instances, having them both review the current staged changes, getting feedback, and handling the feedback. Then covering it in tests. Overall I still implement the feature faster than coding it manually, but I spend the majority of the time going back and forth with reviews and handling corner cases, and at the end I have what I feel is a really solid implementation of whatever feature I'm working on. The v1 feature feels more like a v3 given the amount of iteration it has already gone through.
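For readers who want the loop above in concrete form, here is a minimal sketch, assuming each model call is wrapped in a plain Python callable. The names `review_loop`, `Implementer`, `Reviewer`, and `LoopResult` are hypothetical and not taken from any tool mentioned in this thread; the stub functions stand in for real model calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical signatures: each callable stands in for a call to some model.
Implementer = Callable[[str, str, List[str]], str]  # (spec, previous_code, findings) -> code
Reviewer = Callable[[str, str], List[str]]          # (spec, code) -> findings


@dataclass
class LoopResult:
    code: str
    rounds: int
    history: List[List[str]] = field(default_factory=list)  # findings per round


def review_loop(spec: str, implement: Implementer, reviewers: List[Reviewer],
                max_rounds: int = 5) -> LoopResult:
    """Implement once, then review and fix until no reviewer reports a finding."""
    code = implement(spec, "", [])
    history: List[List[str]] = []
    for round_no in range(1, max_rounds + 1):
        # Every reviewer sees the same code; findings are pooled for one fix pass.
        findings = [f for review in reviewers for f in review(spec, code)]
        history.append(findings)
        if not findings:
            return LoopResult(code, round_no, history)
        code = implement(spec, code, findings)
    # Round budget exhausted; the last fix pass has not been re-reviewed.
    return LoopResult(code, max_rounds, history)


# Stubs for illustration only (assumptions, not real model behaviour).
def stub_implement(spec: str, previous: str, findings: List[str]) -> str:
    if not previous:
        return f"# {spec}\n"
    return previous + "".join(f"# handled: {f}\n" for f in findings)


def empty_input_reviewer(spec: str, code: str) -> List[str]:
    return [] if "handled: empty input" in code else ["empty input"]


def overflow_reviewer(spec: str, code: str) -> List[str]:
    return [] if "handled: overflow" in code else ["overflow"]
```

The implementer and reviewers are kept as separate callables so different models can fill each role, which is the split described in the comment. The `max_rounds` cap is an added assumption to keep the loop from running indefinitely.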

etothet • today at 4:30 AM

“A lot of people seem convinced that the point of AI coding is to write low-quality code as fast as possible.”

A lot of people think a lot of things, but I don’t think the majority of people think the point of using LLMs is so they can produce low-quality code. Do they produce low-quality code sometimes or often? Of course. But they also produce high-quality code very often. And sometimes they just do a “fine” job.

One of the promises - and there are plenty of cases where it’s met and where it falls drastically short - is that agentic coding tools can help us write code faster that is just as good as or better than what a human can write. One of the other big ideal payoffs is that agentic coding can allow non-programmers to create things that previously required programmers to create.

We can debate as to how successful we’ve been toward the two goals above, but I think it’s misguided to say that the majority of people think LLMs should produce lower quality code.

crabmusket • today at 12:46 AM

The linked article about getting LLMs to critique each other's code review[1], the magpie tool[2], and also this recent article from Cloudflare about their code review stack[3] are all quite compelling.

I'm fairly AI-skeptical not on grounds of "do they work" but "are they good for the world". I feel that getting AIs to do this kind of review work is a rare case that doesn't outsource thinking and deskill workers. It doesn't trigger the same alarm bells as having the AI write the code (including having the AI fix the issues it discovers). That's setting aside environmental and other ethical concerns, which are still significant to me.

I have been impressed by the recent quality of AI code reviews*, but the experience of interacting with 3 separate AI reviewers via GitHub PRs is pretty terrible. Having more local-oriented and jj/rebase-aware review rounds would be great.

*context: fairly large PHP/Laravel backend and Vue frontend

[1]: https://milvus.io/blog/ai-code-review-gets-better-when-model...

[2]: https://github.com/liliu-z/magpie

[3]: https://blog.cloudflare.com/ai-code-review/

justinlivi • today at 12:47 AM

I find myself spending on average more time in LLM review/resolution loops than it would take for me to write the code by hand. Partially because once I'm in the flow I write very very quickly and the code pours out sometimes faster than I can write. But also because the LLM code on the first few tries is generally really really bad. What I find interesting, though, is that spending the time to personally review and direct the LLM through several iterations of review and revision on average results in higher quality code, written in about the same time as I would have written it. This might be particular to me, but seeing several iterations of someone else's code helps me better understand my objective holistically, as opposed to whatever happens to come out of my flow-state consciousness.

alexwwang • today at 4:31 AM

So I am figuring out how to let an LLM write code automatically as long as I clarify the requirements. I have made a set of skills to deal with this, and it is called tdd-pipeline. I eat my own dog food, and after several rounds of iteration to fix bugs, it works better and better. Now I feel much more relaxed while it is working.

I open-sourced it on GitHub; search for alexwwang/tdd-pipeline if you are interested.

TACIXAT • today at 1:05 AM

This article doesn't address writing code with AI, just code review. My issue with agentic coding is that I make numerous micro-architectural decisions while programming. I almost never have a full spec up front and develop one as I consider what I am writing.

When using Claude Code or Codex, that is all gone. Claude Code is extremely eager to reach the end goal to the point that it feels like a fever dream to write code with it. In the end, I have low confidence about edge cases and fit into the project's architectural and design goals.

On top of that, I enjoy programming, reverse engineering, etc. and I feel that the LLMs, while able to solve some problems or deliver some features, take that fun away. I'm trying really hard to find a workflow with them that I'm confident in, but I fear that workflow is just chat, search, and being a rubber duck for my thoughts.

smusamashah • today at 12:42 AM

The title of this article suggested more depth and I was expecting actual code examples. But it is like other opinion pieces. It suggests a prompt (ask AI to find bugs) that works for the author and advises everyone to do it that way.

I use these tools both at work and for personal side projects, and I was expecting to watch and learn. But there are way too many of these opinion pieces without examples now.

hintymad • today at 4:07 AM

On the other hand, some companies are pushing the idea that engineers should build robust self-evaluating agent pipelines with human feedback in the loop so that agents write most of the production code. Creao's CEO said that they rearchitected their entire production systems in two weeks this January. He also claimed that their agents implemented so many features so fast that they had to wait for their business development to catch up.

I wonder how we can evaluate these two options: using AI to 100X the output versus using AI to advance one's craft.

In the meantime, the productivity gain of AI is real. Case in point: an engineering org at Snowflake met all its OKRs ahead of time in the first quarter, for the first time in the company's history. It had never happened before, and usually meeting 70% of the planned OKRs would be considered an achievement. I can imagine the stress of the engineers when they see such an outcome.

themanmaran • today at 3:00 AM

As I read this, I'm also working through a pretty dense feature that took a fair bit of iteration. The end result is actually significantly less code than it was about halfway through. And I was wondering if the AI actually helped me at all, since surely I could have written the code in the same time it took to iterate.

But! Because of AI I was able to rapidly hack out like 4 variants of this feature that I didn't like. And I felt comfortable throwing them away just as quickly.

vessenes • today at 12:58 AM

One thing that's been interesting to me over the last few years is charting the edge of my coding laziness. As a coder, I'm lazy about boilerplate code -- I hate writing it, I hate maintaining it, etc. And so I design and architect (or used to) around that preference. Sometimes that's smart, sometimes that's not. But it was my preference, and I avoided something that was hard for me to do.

When LLMs started being somewhat useful for coding a few years ago, and I found they were in fact great at boilerplate, in fact pretty much only good at boilerplate ca 2023 or so, it got me thinking about all the accommodations we make in design and systems architecture that are sort of tacitly understanding who we're working with and their strengths and weaknesses.

The modern models have their own very different strengths and weaknesses compared to humans, and deploying them is a really interesting exercise of different architectural and engineering skills. I've enjoyed it, and hope I continue to.

abhis3798 • today at 4:08 AM

Love this. I use a similar "ralph-loop" approach that starts with an approved plan and then hands it off to a coordinator, which runs it across 2 sessions (build and review, for simplicity), with each session getting its own model.

boringstack • today at 3:43 AM

Optimizing for code quality over raw output speed is a great approach. The time 'lost' writing it slowly is easily made up by the time saved on debugging and maintenance later.

ciconia • today at 4:08 AM

To me the blocker with using coding agents is having to rely on a paid external service. Are there any local models that are good enough to be used for coding?

sreekanth850 • today at 3:54 AM

100% agree after building a production-ready platform from the ground up. It took 3-4 months, but without AI I would never have finished with a team of 3. One thing to note is that AI is weak at front end. So we did the entire front end without AI.

Waterluvian • today at 2:50 AM

I think my current conclusion is that AI makes <foo> more important than ever.

I’m not exactly sure what <foo> is but I feel it. I think it’s quality and authenticity and craftsmanship. That difference between an expensive tool and a cheap one that you can’t easily describe but you just know it.

Is there a word for this? I bet the Japanese or Germans have a word for this.

I use AI a lot now. But I also do it in small steps. It isn’t a craftsman, but it can help me be one.

kiba • today at 12:29 AM

I used an LLM as a tutor to tackle unfamiliar terrain. That is, I write code that I know very likely doesn't work but is the best code that I could have written. The LLM will happily and tirelessly show me what I did wrong and what the correct code actually looks like. Then, at the end of it, I have code that runs. That's a tight feedback loop.

It's still very slow. It took me two hours to write code that generates JSON data and then to write a web page that displays a knowledge graph.

One thing you have to be aware of is that the LLM will happily generate code for you, and you have to discipline it from time to time. I notice that my reading comprehension begins to suffer if I don't write the code myself and have to understand what the LLM wrote for me, as opposed to the LLM correcting where I went wrong.

One thing I would like to try with an LLM is understanding a large and complex existing codebase like OpenSCAD that doesn't leverage my existing skillset (high level programming languages, with OpenSCAD as my primary language in the past year). That has always been a barrier to contribution for me.

syntaxing • today at 12:56 AM

Hot take: barring special edge cases, I find using dumber models (like local Qwen 3.6) to be the best balance. Smart enough to do stuff but dumb enough that I don’t trust it and verify what it’s doing, rather than letting it do the third whole code base refactoring of the day. Also forces me to know my code base and give very descriptive tasks rather than go “something is wrong, fix it”.

ElenaDaibunny • today at 3:06 AM

The bug-finding use case alone makes this worth it.

reactordev • today at 2:08 AM

This is the approach I take, with many guardrails and nested CLAUDE.md's to keep things sane.

ptlan_asnh • today at 12:42 AM

How profound! Talking points are changing from "vibe coding delivers bug free software" to "slow down and enjoy the AI".

Great how the promoters are mirroring the current anti-AI sentiment. The next step is canceling all subscriptions and not using AI at all. Maybe your mind will work again.

EFLKumo • today at 1:30 AM

https://news.ycombinator.com/item?id=48246232

This reminds me of the article above. People now have diverse ideas on agentic coding. Some suggest human-in-the-loop while others suggest giving a detailed specification and letting the agent run freely; some suggest leveraging LLMs' high productivity, and here we get the opinion that an LLM can actually write good code slowly.

It's good to see more practical and varied opinions emerging, turning the LLM into literally a tool instead of something to be hated or hyped.

In my own practice, I find LLMs (SOTA ones) good at medium-level tasks, those that need some reasoning and planning. However, their design taste in architecture is unexpectedly disgusting. Sometimes writing the interfaces myself and asking LLMs to fill in the implementations, alongside context-completing tools like context7, deepwiki, docs.rs MCPs, etc., and giving them an escape hatch (e.g. encouraging the use of the AskUser tool in Claude Code), may be considered my best practice.
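As an illustration of the "write the interface yourself, let the LLM fill in the implementation" split, here is a minimal sketch in Python. Everything in it is hypothetical: the `RateLimiter` interface, the `FixedWindowLimiter` implementation, and the `check_contract` helper are invented for the example and are not from the comment or from any of the tools it names.

```python
from typing import Dict, Protocol, Tuple


class RateLimiter(Protocol):
    """Hand-written interface: the part the human designs and owns."""

    def allow(self, key: str, now: float) -> bool:
        """Return True if `key` may proceed at time `now` (in seconds)."""
        ...


class FixedWindowLimiter:
    """The kind of implementation one might delegate to the model."""

    def __init__(self, limit: int, window: float) -> None:
        self.limit = limit
        self.window = window
        self._state: Dict[str, Tuple[float, int]] = {}  # key -> (window_start, count)

    def allow(self, key: str, now: float) -> bool:
        start, count = self._state.get(key, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # window expired: start a fresh one
        if count >= self.limit:
            self._state[key] = (start, count)
            return False
        self._state[key] = (start, count + 1)
        return True


def check_contract(limiter: RateLimiter) -> None:
    """Human-owned contract check; assumes limit=2 and window=10.0."""
    assert limiter.allow("a", 0.0)
    assert limiter.allow("a", 1.0)
    assert not limiter.allow("a", 2.0)  # third call inside the window is rejected
    assert limiter.allow("b", 2.0)      # keys are tracked independently
    assert limiter.allow("a", 10.0)     # a new window resets the count
```

The interface and the contract check stay under human control, so a generated implementation can be swapped or regenerated without the design drifting.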

knuckle • today at 2:00 AM

Stop being reasonable! This is a hypecycle!

tonymet • today at 2:43 AM

Are we overcomplicating AI by approaching it top-down, so that naturally there are trillions of variations and too many ways to fail? Supervising a component-level scope, with emphasis on quality control (regression, perf testing, benchmarking, etc.), seems to produce great work.

efitz • today at 1:11 AM

Great article and right on point.

anuis258 • today at 3:53 AM

hmmm

npollock • today at 12:43 AM

learn by considering critique

slopinthebag • today at 12:46 AM

I use cheaper models (Deepseek is king, but GLM and Kimi as well) and do the planning myself. I often start a task myself, write some code to get the LLM on the right track, and then have it complete parts of the implementation that are kind of boring or repetitive. LLMs are just next-token predictors; I don't mean that in a demeaning way, but I've found that if I can get the LLM started on the right track with my own code, it completes what I want. Having the LLM write code just from a spec ends up with poor quality slop in my experience.

I'm not 100x'ing my output like some people claim, but using it as an augmentation rather than delegating my work to it results in better code, and I don't lose context / control over my codebases. I really have read 100% of the code, because the LLM is generating smaller pieces around and inside my own written code. Works well enough for me, and open models are already both cheap enough and good enough for this workflow. This is why the big companies are so desperate to push full-on agentic hands-off workflows and developer replacement - that's the only way they won't go bankrupt.

seblon • today at 12:36 AM

I want to mention that Claude Code has a /code-review command. I find it quite useful.

ai_fry_ur_brain • today at 1:47 AM

Just don't use it lol, it does nothing you can't do by yourself. You're nerfing parts of your brain by relying on it.

alasano • today at 12:41 AM

Instead of using a skill and having the agent own the flow for this, I've been building an external orchestrator that handles the process.

By default it uses pi agent core + pi ai (from the excellent pi coding agent) as a multi model runtime but also supports a Claude Agent SDK runtime.

I can have an implementation and review process of an OpenSpec change run anywhere from 2 hours to 24+ hours going through review/fix/verification rounds automatically until the implementation matches the spec and any additional reviewers are done finding issues after the fix rounds.

It's going to be fully open sourced in the next two weeks and fully free to use.

https://engine.build
