Really this is why the LLM needs to be able to write exploits for issues it finds. Of course that leads down a rabbit hole of other issues. But if an exploit works, then that's pretty conclusive evidence.
For a subset of bugs, yes. For some others, not really: I've seen LLMs make bogus assumptions about the threat model (in which case, the exploit works but doesn't demonstrate anything useful) or "cheat" by modifying the code to demonstrate a hallucinated issue.
Frontier models, including Mythos, can greatly streamline bug hunting and exploit development in the hands of a competent security engineer. In the hands of a person with no security experience, they will still mostly waste your time and money.