In pretty much every single HN post on this topic, there are a number of commenters claiming it’s false. Continued quantifiable data like this seems very important at hopefully resolving the ongoing disagreement about the facts.
I've seen plenty of people saying "Mythos isn't all that exceptional, lots of LLMs can find security vulnerabilities" -- and indeed there is some evidence for that; it sounds like Anthropic was taken somewhat by surprise at how easily a simple prompt managed to get Mythos to deliver exploits and didn't distinguish immediately between the effectiveness of Mythos and the effectiveness of the prompt.
But the claim of "LLMs aren't making a difference in vulnerability discovery" has been laughable to anyone who has been reading security advisories for the past 3 months. Just look at the Credits lines.
I've seen plenty of people saying "Mythos isn't all that exceptional, lots of LLMs can find security vulnerabilities" -- and indeed there is some evidence for that; it sounds like Anthropic was taken somewhat by surprise at how easily a simple prompt managed to get Mythos to deliver exploits and didn't distinguish immediately between the effectiveness of Mythos and the effectiveness of the prompt.
But the claim of "LLMs aren't making a difference in vulnerability discovery" has been laughable to anyone who has been reading security advisories for the past 3 months. Just look at the Credits lines.