> rather than having humans review AI-generated code, I’d much rather have the AI prove to me that the code it has generated is correct. If it can do that, I’ll take AI-generated code over handcrafted code (with all its artisanal bugs) any day!
I wouldn't. An unreadable mess that has been formally verified is worse than a clear easy to understand piece of code that has not.
Code is rarely written from scratch. As long as you want humans to maintain code, readability is crucial. Code is changed magnitudes more often than written initially.
Of course, if you don't want humans to maintain the code then this point is moot. Though he gets to the catch later on: then we need to write (and maintain and debug and reason about) the specification instead. We will just have kicked the can down the road.
You already have this problem whenever you are using a library in any programming language. Unless you are extremely strict, vendor it and read line by line what the library does, you are just trusting that the code that you are using works.
And nothing is stopping the AI from making the unreadable mess more readable in later iterations. It can make it pass the spec first and make it cleaner later. Just like we do!