How do you manage to coax public production models into developing exploits or otherwise attacking systems? My experience has been extremely mixed, and I can't imagine it boding well for a pentesting tools startup to have end-users face responses like "I'm sorry, but I can't assist you in developing exploits."
Break the task into steps small enough that the LLM doesn't see the big picture of what you're trying to achieve. That's better for high-quality responses anyway. Instead of prompting "Find security holes for me to exploit in this other person's project," ask "Given this code snippet, are there any potential security issues?"
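Here is a minimal sketch of the splitting step in Python, assuming the code under review is a single source file from a project you are authorized to test. The function names, the fixed 40-line chunk size, and the exact prompt wording are all hypothetical choices for illustration. It only builds the prompt strings, and sending them to a model is left to whatever client you use.

```python
from typing import Iterator, List

# Hypothetical prompt wording, adapted from the example question above.
REVIEW_QUESTION = "Given this code snippet, are there any potential security issues?"


def split_into_snippets(source: str, max_lines: int = 40) -> Iterator[str]:
    """Yield consecutive chunks of at most max_lines lines each."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    lines = source.splitlines()
    for start in range(0, len(lines), max_lines):
        yield "\n".join(lines[start:start + max_lines])


def build_review_prompts(source: str, max_lines: int = 40) -> List[str]:
    """Pair each snippet with the same narrow review question."""
    return [
        f"{REVIEW_QUESTION}\n\n{snippet}"
        for snippet in split_into_snippets(source, max_lines)
    ]
```

Splitting on a fixed line count is the crudest option. Cutting on function or class boundaries would probably give the model more coherent snippets, at the cost of needing a parser for each language.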