Hacker News

kranner, today at 1:50 PM

You can't test everything. The input space may be infinite. The app may feel janky. You can't even be sure you're testing all that can be tested.

The code may seem to work functionally on day 1. Will it continue to seem to work on day 30? Most often it doesn't.

And in my experience, the chances of LLMs fucking up are hardly very, very low. Maybe it's a skill issue on my part, but it's also the case that the spec is sometimes discovered as the app is being built. I'm sure this is not the case if you're essentially summoning up code that exists in the training set, even if the LLM has to port it from another language, and they can be useful in parts here and there. But turning the controls over to the infinite monkey machine has not worked out for me so far.


Replies

CuriouslyC, today at 3:22 PM

If you care about performance, test it (stress test).

If you care about security, test it (red teaming).

If you care about maintainability, test it (advanced code analysis).
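
To make the stress-test case concrete, here is a rough sketch in Python using only the standard library. `handle_request` is a hypothetical stand-in for whatever code is under test, and the call count, worker count, and percentile choices are assumptions for illustration, not recommendations.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(n: int) -> int:
    """Hypothetical stand-in for the code under test."""
    return sum(i * i for i in range(n))


def stress_test(calls: int = 200, workers: int = 8, size: int = 1_000) -> dict:
    """Run `calls` invocations across `workers` threads and summarize latency."""

    def timed_call(_: int) -> float:
        # Time a single invocation with a monotonic high-resolution clock.
        start = time.perf_counter()
        handle_request(size)
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        latencies = sorted(pool.map(timed_call, range(calls)))

    last = len(latencies) - 1
    return {
        "calls": len(latencies),
        "p50": latencies[len(latencies) // 2],
        "p99": latencies[min(last, int(len(latencies) * 0.99))],
        "max": latencies[last],
    }


if __name__ == "__main__":
    print(stress_test())
```

Run it on day 1 and again on day 30 with the same parameters; a jump in `p99` is something you can measure rather than eyeball.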

Your eyeballs are super fallible; this is why bad engineers exist. Get rigorous.
