Human code review does not prove correctness. Almost every software service out there contains bugs. Humans have struggled for decades to reliably produce correct software at scale and speed. Overall, humans have a pretty terrible track record of producing bug-free correct code no matter how much they double-check and review their code along the way.
So the solution is to stop doing code reviews and just YOLO-merge everything? After all, everything is fucked already, how much worse could it get?
For the record, there are examples where human code review and design guidelines can lead to very low-bug code. NASA published their internal guidelines for producing safety-critical code[1]. The problem is that developing software under such processes costs too much for most companies, and most companies don't actually produce safety-critical software.
In my experience, the vast majority of LLM code submitted to projects I maintain has subtle bugs that I find through fairly cursory human review. The Copilot code review feature on GitHub also tends to miss actual bugs and report nonexistent ones, making it worse than useless. So in my view, the death of the benefits of human code review has been wildly exaggerated.
[1]: https://en.wikipedia.org/wiki/The_Power_of_10:_Rules_for_Dev...