Hacker News

EE84M3i, last Wednesday at 12:09 AM

Would be curious to hear your hypothesis on what the remaining 10-20% that might be out of reach consists of. Business logic bugs?


Replies

tptacek, last Wednesday at 12:17 AM

Honestly I'm just trying to be nice about it. I don't know that I can tell you a story about the 90% ceiling that makes any sense, especially since you can task 3 different high-caliber teams of senior software security people on an app and get 3 different (overlapping, but different) sets of vulnerabilities back. By the end of 2027, if you did a triangle test, 2:1 agents/humans or vice versa, I don't think you'd be able to distinguish.

Just registering the prediction.