Project Glasswing: what Mythos showed us

160 points • by Fysi • today at 1:37 PM • 65 comments

Comments

What does this mean?

> It's a different kind of tool doing a different kind of work, and that makes a clean apples-to-apples comparison to earlier models difficult.

They claim it’s a different kind of tool and then describe using it the same way you’d use any other model. This felt way worse than the average Cloudflare blog post and mostly just rehashed the Mythos announcement, which had already called out the key parts: chaining and crafting examples.

sandeepkd • today at 4:30 PM

I was expecting some more concrete numbers and surprises. It just seems like a balanced promotional article, probably written using an LLM itself.

xnorswap • today at 3:47 PM

The real question is whether it was Mythos or Opus that wrote this post.

> "Why it matters"

It doesn't. It's a corporate blog, and those were rarely written in one author's voice anyway, but it's interesting to see that even large organisations are outsourcing their blogs to LLMs.

MattSayar • today at 4:04 PM

> The loudest reaction to Mythos Preview from other security leaders has been about speed - scan faster, patch faster, compress the response cycle. More than one team we have spoken with is now operating under a two-hour SLA from CVE release to patch in production [...] If regression testing takes a day, you cannot get to a two-hour SLA without skipping it, and the bugs you ship when you skip regression testing tend to be worse than the bugs you were trying to patch.

Over time, I wonder if these models will be able to generate more secure code by default by doing this kind of exploitability testing before ever merging their code.

dataflow • today at 3:41 PM

That's great and all but how severe were the most severe vulnerabilities found? I imagine they don't want to talk about it, but that's really the most interesting and important bit.

sf_tristanb • today at 3:55 PM

Great, but why don't you share real data on how many security vulnerabilities it found? How many were real, and how many weren't?

btown • today at 4:57 PM

This is worth a read specifically for this section and the ones following it, re: custom vs. agentic-coding harnesses. https://blog.cloudflare.com/cyber-frontier-models/#why-point...

Claude Code's harness is remarkable for many use cases, particularly with 1M context sizes. But it's also limited when the scale of code or data to read becomes close to that, or exceeds it. The idea that a cluster of actors can work on a shared, structured set of context snippets, and have guidance around what is relevant to them, is an incredibly useful model outside of cybersecurity as well.

jerrythegerbil • today at 5:34 PM

This blog was written by AI.

staticassertion • today at 5:20 PM

> The harder question is what the architecture around the vulnerability should look like. The principle is to make exploitation harder for an attacker even when a bug exists, so that the gap between when a vulnerability is disclosed and when it is patched matters less. That means defenses that sit in front of the application and block the bug from being reached. It means designing the application so that a flaw in one part of the code cannot give an attacker access to other parts. It means being able to roll out a fix to every place the code is running at the same moment, rather than waiting on individual teams to deploy it.

So nothing new then.

hydra-f • today at 3:56 PM

Besides the poorly written post, the vulnerability discovery workflow might actually give good results.

perching_aix • today at 4:21 PM

It's nice to see them address the instrumentation side of this.

I expressed some concerns along the same lines in the thread about the Mythos evaluation curl did a few days ago, which sounded a lot like the "passing in the repo and telling it go!" type of workflow that this post describes as dramatically less effective.

Disappointed that the post is very slim on details beyond this, however. No hard numbers, neither comparative nor in isolation. Those would arguably have been kinda the point.

yieldcrv • today at 5:15 PM

“Sorry Dave, I’m afraid I can’t do that”

I’m a security researcher

“Oh in that case”

unethical_ban • today at 3:57 PM

Interesting for teams looking to implement AI into their deployment process.

I don't think guardrails are useful long term. Assuming we don't see the end of open near-frontier models, it is folly to try to keep models from doing exploit generation. The solution needs to be all software projects working under the assumption that hackers will be running LLMs against their code in search of exploits, and writing secure code accordingly.

wnevets • today at 3:52 PM

I can't wait to be told that Cloudflare is now part of "The Mythos FUD" campaign.

wutwutwat • today at 4:11 PM

Technically speaking, Cloudflare is, at its core, a security vulnerability itself: the world's largest MITM.

reducesuffering • today at 4:13 PM

There will be no mea culpa from folks insinuating Mythos is a marketing stunt. Nor will there be one each time AI capabilities blast through naive expectations.
