C, Python, and frameworks don't generate all-new code for every task: you're relying on code that has already been thoroughly tested. That simple debugging UI server is probably using some well-tested libraries, which you can reasonably trust to be largely bug-free (and which can be updated later to fix any bugs that do turn up, without breaking your code that relies on them). With AI-generated code, this isn't the case: it's all-new code for each task, so it hasn't been through that kind of testing.