I agree with you about scanners (we banned them at Matasano), but not about the ceiling for agents. Having written agent loops for somewhat similar "surface and contextualize hypotheses from large volumes of telemetry" problems, and, of course, having delivered hundreds of application pentests: I think 80-90% of all the findings in a web pentest report, and functionally all of the findings in a netpen report, are within 12-18 months' reach of agent developers.
I wonder how the baseline for 100% is established - is there any (security-relevant) software that you'd say is essentially free of vulnerabilities?
I'd be curious to hear your hypothesis on what makes up the remaining 10-20% that might stay out of reach. Business logic bugs?
I'd say I agree with you there for the low-hanging fruit. The deep-research work ("there's an image filter here, but we can bypass it by exploiting some obscure corner of the SVG spec") is where they still fall over and need hand-holding: pointing them at the browser rendering stack, the specs, and so on.
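To make that SVG class of bug concrete, here's a minimal sketch (the filter logic and payload are hypothetical, purely illustrative): SVG is XML, so a "does it parse as an image?" check happily accepts a file that executes script the moment a browser renders it as a document.

    import xml.etree.ElementTree as ET

    # Hypothetical upload filter: accepts anything that parses as SVG.
    def naive_image_filter(data: bytes) -> bool:
        try:
            return ET.fromstring(data).tag.endswith("svg")
        except ET.ParseError:
            return False

    # Classic payload: a well-formed SVG carrying a <script> element,
    # which runs if the browser renders the file directly.
    payload = (b'<svg xmlns="http://www.w3.org/2000/svg">'
               b'<script>alert(document.domain)</script></svg>')
    print(naive_image_filter(payload))  # True: sails past the "it's an image" check

Knowing that this corner of the spec exists is exactly the research step agents still need to be pointed toward.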
I agree with the prediction. The key driver here isn't even model intelligence, but horizontal scaling. A human pentester is constrained by time and attention, whereas an agent can spin up 1,000 parallel sub-agents to test every wild hypothesis and every API parameter for every conceivable injection. Even if the success rate of a single agent attempt is lower than a human's, the sheer volume of attempts more than compensates for it.
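The arithmetic backs this up. A back-of-the-envelope sketch (assuming independent attempts with an illustrative per-attempt success rate; real targets only approximate this):

    # P(at least one success in n tries) = 1 - (1 - p)**n
    p = 0.005   # assumed 0.5% success rate for a single agent attempt
    n = 1000    # parallel sub-agents
    print(1 - (1 - p) ** n)  # ~0.993: near-certainty from individually weak shots

A human making a handful of careful attempts can't match that, even at a much higher per-attempt hit rate.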