staticassertion • today at 7:09 PM

I'd love to see them point at a target that's not a decades-old C/C++ codebase. Of the targets, only browsers should be considered hardened, and their biggest lever is sandboxing, which requires a lot of chained exploits to bypass. We're seeing that LLMs are fast to discover bugs, which means they can chain more easily. But bug density in these codebases is known to be extremely high, especially in the underlying operating systems, which are always the weak link for sandbox escapes.

I'd love to see them go for a wasm interpreter escape, or a Firecracker escape, etc. They say that these aren't just "stack-smashing" but it's not like heap spray is a novel technique lol

> It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses.

I think this sounds more impressive than it is, for example. KASLR has a terrible track record at preventing LPE, and LPE in Linux is incredibly common. Has anything changed here? I don't pay much attention, but KASLR was considered basically useless for preventing LPE a few years ago.

> Because these codebases are so frequently audited, almost all trivial bugs have been found and patched. What’s left is, almost by definition, the kind of bug that is challenging to find. This makes finding these bugs a good test of capabilities.

This just isn't true. Humans find new bugs in all of this software constantly.

It's all very impressive that an agent can do this stuff, to be clear, but I guess I see this as an obvious implication of "agents can explore program states very well".

edit: To be clear, I stopped about 30% of the way through. Take that as you will.

Replies

jryio • today at 8:11 PM

The majority of vulnerabilities are in newly committed lines of code. This has been shown again and again [1] [2].

From a marketing standpoint Anthropic is showing that they're able to direct 'compute' to find vulnerabilities where human time/cost is not efficient or effective.

Project Glasswing is attempting to pay off as many of these old vulnerabilities as possible now, so that the low-hanging fruit will already have been picked.

The next generation of Mythos and real-world vulnerability exploits are going to be in newly committed code...

[1]: https://dl.acm.org/doi/epdf/10.1145/2635868.2635880

[2]: https://arxiv.org/abs/2601.22196

rfoo • today at 7:17 PM

> Mythos Preview identified a memory-corruption vulnerability in a production memory-safe VMM. This vulnerability has not been patched, so we neither name the project nor discuss details of the exploit.

Good morning Sir.

> Has anything changed here? I don't pay much attention but KASLR was considered basically useless for preventing LPE a few years ago.

No. It's still like this. Bonus point: there are always free KASLR leaks (prefetch side channels).

But then, this thing is just.. I don't have a word for this. Just randomly read paragraphs from the post and it's like, what?
