I've done some pretty incredible things with LLMs. If this were SQLite with its exhaustive test suite... OK, I can see it.
It's hard for me to see this not becoming a pile of slop, but hey, maybe I'm wrong.