Sounds like antirez, simonw, et al. are still advocating reviewing the code output of these agents for now. But presumably soon (within months?) the agents will be good enough that line-by-line review will no longer be necessary, or even humanly possible as we crank the agents up to 11.
But then how will we review each PR enough to have confidence in it?
And how will we understand the overall codebase once it gets much bigger?
Are there better tools for this than just asking LLMs to summarize code or flag risky code? Any good "code reader" tools (like code editors, but focused on the reading task)?
We will review fully until they reach superhuman perfection.