Hacker News

latchkey, yesterday at 11:18 PM

> you can't rely on the thing that produced invalid output to validate its own output

I've been coding an app with the help of AI. At first it wrote some pretty awful unit tests, but as more tests accumulated, it got better and better at writing them. What I noticed was that the AI would use the existing tests as context to produce valid output. When I found bugs it had created and had the AI fix them (with more tests), it would then do it the right way. So it actually was validating its own invalid output, because it could rely on the other behaviors captured in the tests to find its own issues.

The project is now at the point that I've pretty much stopped writing the tests myself. I'm sure it isn't perfect, but it feels pretty comprehensive at 693 tests. Feel free to look at the code yourself [0].

[0] https://github.com/OrangeJuiceExtension/OrangeJuice/actions/...


Replies

slopinthebag, yesterday at 11:37 PM

I'm not saying you can't do it, I'm just saying it's not sufficient on its own. I run my code through an LLM and it occasionally catches stuff I missed.
