Models aren't deterministic - every time you tried to re-apply the change you'd likely get different output (unless you fed the current code back into the re-apply step and let the model just recommend changes)
If the result is always provably correct, it doesn't matter whether or not it's different at the code level. People interested in systems like this believe that the outcome of what the code does is infinitely more important than the code itself.
Let's rephrase:
Since nobody involved actually cares whether the code works or not, it doesn't matter whether it's a different wrong thing each time.
> If the result is always provably correct, it doesn't matter whether or not it's different at the code level. People interested in systems like this believe that the outcome of what the code does is infinitely more important than the code itself.
If the spec is so complete that it covers everything, you might as well write the code.
The benefit of writing a spec and having the LLM code it is that the LLM will fill in a lot of blanks. And it is this filling in of blanks that is non-deterministic.
Sure, but where are the formal acceptance tests to validate against?
Besides, you can deterministically generate bad code, and non-deterministically generate good code.
I would be very comfortable with this approach: re-run 100 times with different seeds. If the outcome is the same every time, you're reliably good to go.
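A minimal sketch of what that check could look like. Note that `generate_code` here is a hypothetical stand-in for an LLM call, simulated as a seeded function so the example runs; "the same outcome" is judged by behavioral acceptance tests rather than textual identity:

```python
import random

def generate_code(seed: int) -> str:
    """Hypothetical stand-in for an LLM code-generation call.
    The source text varies with the seed (different variable names),
    but the behavior should not."""
    rng = random.Random(seed)
    name = rng.choice(["total", "result", "acc"])
    return f"def add(a, b):\n    {name} = a + b\n    return {name}"

def acceptance_tests(source: str) -> bool:
    """Behavioral check: execute the generated code and test its output."""
    namespace = {}
    exec(source, namespace)
    add = namespace["add"]
    return add(2, 3) == 5 and add(-1, 1) == 0

# Re-run with many seeds: the generated text differs run to run,
# but every run must pass the same acceptance tests before we
# call the outcome "the same every time".
results = [acceptance_tests(generate_code(seed)) for seed in range(100)]
print(all(results))  # True only if every seeded run behaves identically
```

This only demonstrates the comparison mechanism; the hard part the thread is debating is whether the acceptance tests can ever be complete enough to stand in for "provably correct".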
That "if" at the beginning of your sentence is doing a whole lot of work. Indeed, if we could formally and provably (another extremely loaded word) generate good code, that'd be one thing, but proving correctness is one of those basically impossible tasks.