Hacker News

tptacek, last Wednesday at 12:17 AM

Honestly I'm just trying to be nice about it. I don't know that I can tell you a story about the 90% ceiling that makes any sense, especially since you can task 3 different high-caliber teams of senior software security people on an app and get 3 different (overlapping, but different) sets of vulnerabilities back. By the end of 2027, if you did a triangle test, 2:1 agents to humans or vice versa, I don't think you'd be able to distinguish them.

Just registering the prediction.


Replies

karlmdavis, last Wednesday at 2:06 AM

I would take the other side of that bet.

  # if >10 then was_created_by_agent = true
  $ grep -oP '\p{Emoji}' vulns.md | wc -l