I think the main limitation is not code validation but assumption verification. When you ask an LLM to write some code based on a few descriptive lines of text, it is, by necessity, making a ton of assumptions. Oddly, none of the LLMs I've seen ask for clarification when multiple assumptions might all be likely. Moreover, from the behavior I've seen, they don't really backtrack to select a new assumption based on further input (I might be wrong here, it's just a feeling).
What you don't specify, it must assume. And therein lies a huge landscape of possibilities. And since the AI can't read your mind (yet), its assumptions will probably not precisely match yours unless the task is very limited in scope.
> Oddly, none of the LLMs I've seen ask for clarification when multiple assumptions might all be likely.
It's not odd, they've just been trained to give helpful answers straight away.
If you tell them not to make assumptions, and instead to first list all their questions and the assumptions they would otherwise make so you can confirm them before they write any code, they'll do that too. I do it all the time, and I'll get back a list of a dozen or so things to confirm or change.
That's the great thing about LLMs -- if you want them to change their behavior, all you need to do is ask.
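For example, here's a minimal sketch of that "confirm assumptions first" instruction using the OpenAI Python client; the model name, prompt wording, and example task are just placeholders, not anything specific:

```python
# Sketch: ask the model to surface its questions/assumptions before writing code.
# Model name and prompt wording are placeholders; adapt to your own setup.
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "Do not make assumptions silently. Before writing any code, reply with a "
    "numbered list of every question you have and every assumption you would "
    "otherwise make, so I can confirm or correct them. Only write code after "
    "I have answered."
)

task = "Write a function that merges two config files."  # hypothetical task

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": task},
    ],
)

# First reply should be the list of questions/assumptions, not code.
print(response.choices[0].message.content)
```

The same instruction works just as well typed directly into a chat session; the point is only that the "ask before assuming" behavior has to be requested explicitly.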