I believe codegen must be separated from the runtime. Every time you ask the AI to handle a new task, the code it generates should be deployed as a separate app with the fewest privileges possible, potentially with manual approvals at certain steps while the app is executing. So essentially you need a workflow engine.
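
To make that concrete, here is a rough sketch of the shape I have in mind, not a real engine. Every name in it (`GeneratedApp`, `Step`, `run_app`, the privilege strings like `read:sales`) is made up for illustration. It also only checks privileges in-process; an actual deployment would presumably enforce them with OS or cloud level isolation, which this does not attempt.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


class PermissionDenied(Exception):
    """Raised when a step needs a privilege the app was not granted."""


@dataclass(frozen=True)
class Step:
    name: str
    requires: FrozenSet[str]          # privileges this step needs (hypothetical labels)
    action: Callable[[Dict], None]    # stands in for the generated code of this step
    needs_approval: bool = False      # pause for a human before running


@dataclass(frozen=True)
class GeneratedApp:
    name: str
    granted: FrozenSet[str]           # the only privileges this one app gets
    steps: List[Step]


def run_app(app: GeneratedApp, approve: Callable[[str, str], bool]) -> List[str]:
    """Run one generated app; return a log of what happened to each step."""
    # Deploy-time check: refuse the whole app if any step exceeds its grant.
    for step in app.steps:
        missing = step.requires - app.granted
        if missing:
            raise PermissionDenied(f"{app.name}/{step.name} lacks {sorted(missing)}")

    log: List[str] = []
    state: Dict = {}                  # shared state passed from step to step
    for step in app.steps:
        # Manual approval gate: stop the workflow if the human says no.
        if step.needs_approval and not approve(app.name, step.name):
            log.append(f"{step.name}: rejected")
            break
        step.action(state)
        log.append(f"{step.name}: done")
    return log


if __name__ == "__main__":
    app = GeneratedApp(
        name="weekly-report",
        granted=frozenset({"read:sales"}),
        steps=[
            Step("load", frozenset({"read:sales"}), lambda s: s.update(rows=3)),
            Step("send", frozenset({"read:sales"}), lambda s: None, needs_approval=True),
        ],
    )
    # A real approver would ask a human; this stand-in rejects everything.
    print(run_app(app, approve=lambda app_name, step_name: False))
```

The point of the split is that the model only produces the `steps`; the grant and the approval callback belong to the runtime, so generated code cannot widen its own privileges or skip the human check.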