It cost me ~$750 to find a tricky privilege escalation bug in a complex codebase where I knew the rough specs but didn't have the exploit. There are certainly still many other bugs like that in the codebase, and it would cost $100k-$1MM to explore the rest of the system that deeply with models at or above the capability of Opus 4.6.
It's definitely possible to do a basic pass for much less (I do this with autopen.dev), but it is still very expensive to exhaustively find the harder vulnerabilities.
How much would it have cost a human to do the same work? The question isn't how much the tokens cost; the question is how much money is saved by using AI to do it.
Compare that to the cost when those vulnerabilities are exploited by bad actors in critical systems. Worth it yet?
This is where the Codex and Claude Code Pro/Max plans are excellent. I rarely run into the limits of Codex. If I do, I wait, then come back and have it resume once the window has expired.