logoalt Hacker News

joshribakofftoday at 2:05 PM0 repliesview on HN

While nothing fundamentally changes i have found an increased need for tests and their taxonomies — because the LLM can “hack” the tests. So, having more robust tests with more ways to organize and run the tests. For example instead of 200 tests maybe i have 1,200, along with some lightweight tools to run tests in different parts of the test taxonomies.

A more concrete example is maybe you have tests that show you put a highlight on the active item tests that show you don’t put the highlight on the inactive items, but with an llm you might also want to have tests that wait a while and verify the highlight is not flickering on and off overtime (something so absurd you wouldn’t even test for it before AI).

The value of these test is in catching areas of the code where things are drifting towards nonsense because humans aren’t reviewing as thoroughly. I don’t think that you can realistically have 100% data coverage and prevent every single bug and not review the code. It’s just that I found that slightly more tests are warranted if you do want to step back.