Hacker News

jonahx, last Tuesday at 10:07 PM

So the stuff that agents would excel at is essentially just the "checklist" part of the job? Check A, B, C, possibly using tools X, Y, Z, possibly multi-step checks but everything still well-defined.

Whereas finding novel exploits would still be the domain of human experts?


Replies

tptacek, last Tuesday at 10:10 PM

I'm bullish on novel exploits too but I'm much less confident in the prediction. I don't think you can do two network pentests and not immediately reach the conclusion that the need for humans to do significant chunks of that work at all is essentially a failure of automation.

With more specificity: I would not be at all surprised if the "industry standard" netpen was 90%+ agent-mediated by the end of this year. But I also think that within the next 2-3 years, that will be true of web application testing as well, which is in a sense a limited (but important and widespread) instance of "novel vulnerability" discovery.

cookiengineer, last Wednesday at 6:38 AM

Well, agents can't discover bypass attacks because they don't have memory. That was what DNCs [1] (Differentiable Neural Computers) tried to accomplish. Correlating scan metrics with analytics is, btw, a great task for DNCs and something they are good at because of how their (not so precise) memory works. They are not so good, though, at understanding branch logic and its consequences.

However, I currently believe that forensic investigations will change post-LLMs, because they're very good at translating arbitrary bytecode, assembly, netasm, Intel asm, etc. syntax to example code (in any language). The translations don't have to be 100% correct, which is why LLMs can be really helpful for the discovery phase after an incident. Check out the Ghidra MCP server, which is insane to see in real time [2]

[1] https://github.com/JoergFranke/ADNC

[2] https://github.com/LaurieWired/GhidraMCP

suriya-ganesh, last Tuesday at 10:46 PM

With exploits, you'll have to go through the rote stuff of checklisting over and over, until you see aberrations across those checklists and connect the dots.

If that part of the job is automated away, I wonder how the talent and skill for finding those exploits will evolve.

socketcluster, last Wednesday at 2:13 AM

They suck at collecting the bounty money because they can't legally own a bank account.