A general limitation of this approach is that it is only as good as your validator, and there's nothing easier to validate than a test case that creates, say, an AddressSanitizer use-after-free. For subtler issues will we have to more specific validators or will the LLM become better at coming up with other dangerous conditions it will verify? We'll see.