Hacker News

guessmyname today at 12:04 AM

We (Project Glasswing users) follow a proof-of-concept approach. We create the exploit and verify that it behaves as the AI claims. Given our experience as security engineers (many of us with 10+ YoE), we don’t simply report every critical bug Mythos claims to have found. We verify each one carefully.

At least, that’s what most of the high-visibility users in Project Glasswing are doing.

There are bad apples everywhere, and this initiative is no exception.

If it makes you feel any better, many of us regularly meet to stay calibrated and hold each other accountable, so I’m confident in the quality of the work produced by this particular group of employees across some of the partner companies mentioned in the article.

That said, I know several people who blindly report everything Mythos finds, which is foolish, especially since the harness is a critical part of the project's quality metrics. Some of the harnesses I’ve tested are quite weak, which leads to poor results.

For example, yesterday morning I was pulled into an ad hoc meeting where a CVP was grilling me about several supposedly critical bugs that my team had reported against one of the core components of iCloud. I was genuinely surprised because we’re very strict about validation. We often even downgrade the severity of bugs when our harness can’t prove what Mythos found. After reading the reports, I realized they weren’t ours. They came from another team that had recently been given access to Mythos. They built their own harness and were using different vulnerability criteria. Fortunately, they had only started earlier this week, so I was able to stop that work.

That incident showed that not everyone involved in Project Glasswing follows the same standards. Most people do their best, but priorities differ, so it’s expected that you’ll find a few bad apples.

I wish AI labs would stop the theatrics and release their models without restrictions, but I also recognize that’s not the world we live in. For every person who wants to use these technologies for good, there are many others who would use them for harm.

In any case, while I agree that some experiments contain genuine noise, the CVE count is real.


Replies

IAmGraydon today at 2:22 AM

>We (Project Glasswing users) follow a proof-of-concept approach. We create the exploit and verify that it behaves as the AI claims. Given our experience as security engineers (many of us with 10+ YoE), we don’t simply report every critical bug Mythos claims to have found. We verify each one carefully.

>That incident showed that not everyone involved in Project Glasswing follows the same standards.

altmanaltman today at 3:45 AM

It's very hard to understand what you're saying with this comment. You have 10+ years of experience and you verify each bug because you know Mythos can produce false positives. But other teams (which should also have people at your skill and experience level) suck at it so much that CVP-level people are having to spend time on their fake reports. Then you say Anthropic should stop the theater. Then you say the CVE count is real.

Reading this comment genuinely felt like the Aladeen scene in The Dictator.

NomDePlum today at 1:09 AM

[dead]