Excellent observation. I get so frustrated every time I hear the "we have test-suites and can test deterministically" argument. Have we learned absolutely nothing from the last 40 years of computer science? Testing does not prove the absence of bugs.
Don't worry, the LLM also makes the tests. /s