> let's have LLMs check our code for correctness Lmao. Rofl even. (Testing is the one th...

otabdeveloper4 • yesterday at 7:56 PM • 4 replies

> let's have LLMs check our code for correctness

Lmao. Rofl even.

(Testing is the one thing you would never outsource to AI.)

Replies

Outsourcing testing to AI makes perfect sense if you assume that tests exist out of an obligation to meet some code coverage requirements, rather than to ensure correctness. Often I'll write a module and a few tests that cover its functionality, only for CI to complain that line coverage has decreased and reject my merge! AI to the rescue! A perfect job for a bullshit generator.
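To make the coverage-versus-correctness gap concrete, here is a minimal sketch. `apply_discount` and both test functions are hypothetical names invented for illustration, and the bug is deliberate. The first "test" executes every line, which would satisfy a line-coverage gate, yet it checks nothing.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function with a deliberate bug: it adds instead of subtracting."""
    return price + price * percent / 100  # bug: should be price - ...


def coverage_only_test() -> None:
    # Runs every line of apply_discount, so line coverage is 100%,
    # but nothing is asserted, so the bug goes unnoticed.
    apply_discount(100.0, 10.0)


def correctness_test() -> bool:
    # Same call, but the result is compared against the expected value.
    return apply_discount(100.0, 10.0) == 90.0


coverage_only_test()                  # "passes" silently
bug_caught = not correctness_test()   # the checking test fails, exposing the bug
```

Both functions produce identical coverage numbers; only the second one says anything about whether the code is right.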


sshine • yesterday at 8:41 PM

> Testing is the one thing you would never outsource to AI

That's not really true.

Making the AI write the code, the test, and the review of itself within the same session is YOLO.

There's a ton of scaffolding in testing that can be easily automated.

When I ask the AI to test, I typically provide a lot of equivalence classes.

And the AI still surprises me with finding more.
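As a rough illustration of what "providing equivalence classes" might look like, here is a table-driven sketch. The function `classify_age` and its class boundaries are assumptions made up for this example, not something from the thread. Each class lists representative inputs, including the boundary values.

```python
def classify_age(age: int) -> str:
    """Hypothetical function under test."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


# One entry per equivalence class: expected result -> representative inputs.
EQUIVALENCE_CLASSES = {
    "minor": [0, 9, 17],
    "adult": [18, 40, 64],
    "senior": [65, 90],
}


def run_checks() -> int:
    """Check every representative input and return how many checks ran."""
    checked = 0
    for expected, samples in EQUIVALENCE_CLASSES.items():
        for age in samples:
            assert classify_age(age) == expected, (age, expected)
            checked += 1
    # The invalid class: negative ages must be rejected.
    try:
        classify_age(-1)
    except ValueError:
        checked += 1
    else:
        raise AssertionError("negative age should be rejected")
    return checked
```

Handing a table like this to the AI gives it a concrete starting point, and any class it proposes beyond the table is easy to review as one more row.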

On the other hand, it's equally excellent at saying "it tested", and when you look at the tests, they can be extremely shallow. Or there can be quite a few unit tests for certain parts of the code, but when you run the whole program, it just breaks.

The most valuable tests when programming with AI (generated by AI, or otherwise) are near-realistic integration tests. That's true for human programmers too, but we take for granted that the casual use of a program as we develop it constitutes a poor man's test. When people who generally don't write tests start using AI, there's nothing but fingers crossed.

I'd rather say: If there's one thing you would never outsource to AI, it's final QA.
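A near-realistic integration test, in the sense used above, runs the whole program the way a user would instead of calling individual functions. The sketch below is only an assumption about how that could look: the tiny `sum_cli.py` program and the `run_program` helper are hypothetical, and the program is written to a temporary file so the example is self-contained.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical command-line program: prints the sum of its integer arguments.
PROGRAM = '''
import sys

def parse(argv):
    return [int(a) for a in argv]

def main():
    print(sum(parse(sys.argv[1:])))

if __name__ == "__main__":
    main()
'''


def run_program(*args: str) -> subprocess.CompletedProcess:
    """Run the whole program in a fresh interpreter, as a user would."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "sum_cli.py"
        script.write_text(PROGRAM)
        return subprocess.run(
            [sys.executable, str(script), *args],
            capture_output=True,
            text=True,
            timeout=30,
        )


ok = run_program("1", "2", "3")   # well-formed input
bad = run_program("1", "two")     # malformed input crashes the program
```

Unit tests of `parse` alone could all pass while the program still dies with a traceback on bad input; checking `returncode` and `stdout` of the real process is what surfaces that.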

ben_w • yesterday at 8:31 PM

> (Testing is the one thing you would never outsource to AI.)

I would rephrase that as "all LLMs, no matter how many you use, are only as good as one single pair of eyes".

If you're a one-person team and have no capital to spend on a proper test team, set the AI at it. If you're a megacorp with 10k full-time QA testers, the AI probably isn't going to catch anything novel that the rest of them didn't, but it's cheap enough that you can have it work through everything to make sure you have, actually, worked through everything.

LoganDark • yesterday at 8:09 PM

You don't use the LLM to check your code for correctness; you use the LLM to generate tests to exercise code paths, and verify that they do exercise those code paths.
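One way to verify that generated tests really exercise the intended code paths is to record which lines run. In practice a coverage tool would do this; the sketch below uses only the standard library's `sys.settrace` to show the idea. The function `sign` and the helper `lines_hit` are hypothetical names for this example.

```python
import sys


# Hypothetical function under test, with three distinct paths.
def sign(n: int) -> str:
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def lines_hit(func, *args) -> set:
    """Return the line offsets (relative to the def line) executed inside func."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function under test.
        if frame.f_code is code and event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore any tracer that was already installed
    return hit


weak_suite = lines_hit(sign, 3)  # a "suite" that only tries a positive number
full_suite = lines_hit(sign, -5) | lines_hit(sign, 0) | lines_hit(sign, 3)
never_reached = full_suite - weak_suite  # paths the weak suite does not exercise
```

A non-empty `never_reached` means the generated tests skipped some branches, here the "negative" and "zero" returns. Note that this only confirms the paths ran, not that their results were checked.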

