That's definitely a pattern people are already starting to get good results from - using multiple "agents" (aka multiple system prompts), where one of them is a security reviewer that audits for problems and files issues for the other coding agents to fix.
I don't think this worked at all well six months ago. GPT-5.2 and Opus 4.5 might just be good enough for this pattern to start being effective.
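For anyone who wants to picture the reviewer/fixer loop, here is a rough sketch. Everything in it is made up for illustration: the `ModelFn` type, both system prompts, the `Issue` class, and `fake_model`, which is a canned stand-in so the loop runs without any real LLM client. Swap in whatever model call you actually use.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical interface: (system_prompt, user_message) -> model reply text.
ModelFn = Callable[[str, str], str]

# Illustrative prompts only; real ones would be much more detailed.
REVIEWER_PROMPT = "You are a security reviewer. List one problem per line, or reply NONE."
CODER_PROMPT = "You are a coding agent. Return the full fixed source for the given issue."


@dataclass
class Issue:
    title: str
    fixed: bool = False


def review(model: ModelFn, source: str) -> List[Issue]:
    """Ask the reviewer 'agent' to audit the source and file issues."""
    reply = model(REVIEWER_PROMPT, source)
    lines = [line.strip() for line in reply.splitlines()]
    return [Issue(line) for line in lines if line and line.upper() != "NONE"]


def fix(model: ModelFn, source: str, issue: Issue) -> str:
    """Hand one issue to the coding 'agent' and take back the new source."""
    new_source = model(CODER_PROMPT, f"Issue: {issue.title}\n\n{source}")
    issue.fixed = True
    return new_source


def run_loop(model: ModelFn, source: str, max_rounds: int = 3) -> Tuple[str, List[Issue]]:
    """Alternate review and fix until the reviewer is happy or rounds run out."""
    history: List[Issue] = []
    for _ in range(max_rounds):
        issues = review(model, source)
        if not issues:
            break
        for issue in issues:
            source = fix(model, source, issue)
            history.append(issue)
    return source, history


def fake_model(system_prompt: str, user_message: str) -> str:
    """Canned stand-in for an LLM: flags shell=True, then 'fixes' it."""
    if system_prompt == REVIEWER_PROMPT:
        if "shell=True" in user_message:
            return "subprocess call uses shell=True"
        return "NONE"
    # Coder role: the source follows the blank line after the issue title.
    source = user_message.split("\n\n", 1)[1]
    return source.replace("shell=True", "shell=False")
```

The point is that the two "agents" are the same model function called with different system prompts, and the issue list is the only thing passed between them. The `max_rounds` cap is an assumption on my part, there so that a reviewer that never says NONE can't loop forever.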
My current dark factory stack uses a Cyber Elon [0] as CEO, with a dev team consisting of Gilfoyle, 2x Mr. Robots, and Pickle Rick, and Alan Turing as dev manager. It easily 5x'd my output in raw performance metrics, and that's considering I had already easily achieved a 10x over baseline dev performance using vanilla agents and other mainstream AI techniques. Whenever people say AI is just glorified autocomplete I know they haven't been using the latest model versions.
[0] Basically an immortal version of Elon Musk with his mind cybernetically fused with Grok AI