Hacker News

PaulDavisThe1st, yesterday at 3:53 PM (4 replies)

That's not how software development works.

Folks think, they write code, they do their own localized evaluation and testing, then they commit and then the rest of the (down|up)stream process begins.

LLMs skip over the "actually verify that the code I just wrote does what I intended it to" step. Granted, most humans don't do this step as thoroughly and carefully as would be desirable (sometimes through laziness, sometimes because of a belief in (down|up)stream testing processes). But LLMs don't do it at all.


Replies

sally_glance, yesterday at 4:00 PM

They absolutely can do that if you give them the tools. Seeing Claude (I use it with opencode agents) run curl and playwright to verify and then fix its implementation was a real 'wow' moment for me.

mapontosevenths, yesterday at 4:03 PM

> LLMs skip over the "actually verify that the code I just wrote does what I intended it to" step.

I'm not sure where this idea comes from. Just instruct it to write and run unit tests and document as it goes. All of the ones I've used will happily do so.

You still have to verify that the unit tests are valid, but that's still far less work than skipping them or writing the code/tests yourself.

jimmaswell, yesterday at 4:35 PM

> actually verify that the code I just wrote does what I intended it to

That's what the author did when they ran it.

adventured, yesterday at 4:33 PM

Claude Opus 4.5 will routinely test its own code before handing it off to you, even with zero instruction to do so.
