Hacker News

culopatin, last Thursday at 10:09 PM

But are those tests relevant? I tried using LLMs to write tests at work, and whenever I review them I end up asking, “OK, great, it passes the test, but is the test relevant? Does it test anything useful?” And I get an “Oh yeah, you’re right, this test is pointless.”


Replies

manmal, last Thursday at 11:07 PM

Keep track of test coverage and ask it to delete tests without lowering coverage by more than, let’s say, 0.01 percentage points. If you have a script that gives it only the coverage number, plus a file listing all tests with their line number ranges, it is more or less a dumb task it can work on for hours without actually reading the test files (which would fill the context too quickly).
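
A rough sketch of that loop, written as a deterministic script instead of an LLM task. It assumes you can already get, per test, the set of source lines it covers (for example from your coverage tool's per-test data). The names `coverage_percent`, `prune_tests`, and `max_drop` are made up for illustration, and line coverage is only a proxy for whether a test is useful.

```python
from typing import Dict, List, Set, Tuple


def coverage_percent(tests: Dict[str, Set[int]], total_lines: int) -> float:
    """Percentage of source lines covered by at least one test."""
    covered = set().union(*tests.values())
    return 100.0 * len(covered) / total_lines


def prune_tests(
    tests: Dict[str, Set[int]],
    total_lines: int,
    max_drop: float = 0.01,  # allowed total drop, in percentage points
) -> Tuple[Dict[str, Set[int]], List[str]]:
    """Greedily delete tests while coverage stays within max_drop of the baseline."""
    kept = dict(tests)
    baseline = coverage_percent(kept, total_lines)
    removed: List[str] = []
    # Try the tests that cover the fewest lines first; they are the
    # likeliest to be redundant.
    for name in sorted(tests, key=lambda n: len(tests[n])):
        candidate = {k: v for k, v in kept.items() if k != name}
        # Compare against the original baseline so the drops cannot add up
        # past the budget over many deletions.
        if baseline - coverage_percent(candidate, total_lines) <= max_drop:
            kept = candidate
            removed.append(name)
    return kept, removed


if __name__ == "__main__":
    # Hypothetical data: test name -> covered line numbers in a 10-line module.
    tests = {
        "test_a": {1, 2, 3},
        "test_b": {2, 3},
        "test_c": {4, 5},
        "test_d": {1, 2, 3, 4, 5, 6},
    }
    kept, removed = prune_tests(tests, total_lines=10)
    print("kept:", sorted(kept))
    print("removed:", removed)
```

With the sample data, only `test_d` survives because it covers everything the other three do. That is also the weakness of the approach: a test can add no coverage and still assert something the others don't.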

tlarkworthy, yesterday at 5:57 AM

We fixed this at work by instructing it to maximize coverage with minimal tests, which is closer to our coding style.

elbear, yesterday at 7:29 AM

Those tests were written by people. That's why they were confident that what the LLM implemented was correct.

wahnfrieden, last Thursday at 10:24 PM

Yes

Skill issue... And perhaps the wrong model + harness