Hacker News

epistasis · today at 5:16 PM

> We took the specific vulnerabilities Anthropic showcases in their announcement, isolated the relevant code, and ran them through small, cheap, open-weights models. Those models recovered much of the same analysis. Eight out of eight models detected Mythos's flagship FreeBSD exploit, including one with only 3.6 billion active parameters costing $0.11 per million tokens.

Impressive, and very valuable work, but isolating the relevant code changes the situation so much that I'm not sure it's much of the same use case.

Being able to dump an entire code base and have the model scan it is the type of situation that opens up vulnerability scanning to a much larger class of people.


Replies

elicash · today at 5:25 PM

This is from the first of the caveats that they list:

> Scoped context: Our tests gave models the vulnerable function directly, often with contextual hints (e.g., "consider wraparound behavior"). A real autonomous discovery pipeline starts from a full codebase with no hints. The models' performance here is an upper bound on what they'd achieve in a fully autonomous scan. That said, a well-designed scaffold naturally produces this kind of scoped context through its targeting and iterative prompting stages, which is exactly what both AISLE's and Anthropic's systems do.

That's why their point is what the subheadline says: the moat is the system, not the model.

Everybody so far here seems to be misunderstanding the point they are making.

loire280 · today at 5:27 PM

> Anthropic's own scaffold is described in their technical post: launch a container, prompt the model to scan files, let it hypothesize and test, use ASan as a crash oracle, rank files by attack surface, run validation. That is very close to the kind of system we and others in the field have built, and we've demonstrated it with multiple model families, achieving our best results with models that are not Anthropic's. The value lies in the targeting, the iterative deepening, the validation, the triage, the maintainer trust. The public evidence so far does not suggest that these workflows must be coupled to one specific frontier model.

The argument in the article is that the framework to run and analyze the software being tested is doing most of the work in Anthropic's experiment, and that you can get similar results from other models when used in the same way.

Jcampuzano2 · today at 5:33 PM

The thing is, with smaller, cheaper models it's entirely feasible to simply take every file in a codebase and prompt the model to find vulnerabilities.

You could even isolate it down to every function and create a harness that provides it a chain of where and how the function is used and repeat this for every single function in a codebase.

For some very large codebases this would be unreasonable, but many of the companies making these larger models do realistically have the compute available to run a model on every single function in most codebases.

You have the harness run this many times per file/function, find the ones that are consistently (or on average) flagged as possible vulnerability vectors, and then pass those on to a larger model to inspect more deeply, and repeat.

Most of the work here wouldn't be the model, it'd be the harness, which is part of what the article alludes to.
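A minimal sketch of that voting step, assuming you already have the codebase split into per-function chunks and a `scan` callback wrapping whatever cheap-model API you use (both hypothetical names, not from the article):

```python
import collections
from typing import Callable

def triage(functions: dict[str, str],
           scan: Callable[[str], bool],
           runs: int = 5,
           threshold: float = 0.6) -> list[str]:
    """Scan every function `runs` times with a (noisy) cheap-model call
    and return the names flagged at or above `threshold`, as candidates
    to escalate to a larger model."""
    votes: collections.Counter = collections.Counter()
    for name, source in functions.items():
        for _ in range(runs):
            if scan(source):  # one cheap-model query per attempt
                votes[name] += 1
    return [name for name, hits in votes.items() if hits / runs >= threshold]
```

The repeated runs are doing the real work here: a single pass of a small model is noisy, but consistent flags across several passes are a cheap signal of where the expensive model's time is worth spending.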

odie5533 · today at 5:23 PM

Isn't the difference just the harness, then? I can write a harness that chunks code into individual functions or groups of functions and then feeds it into a vulnerability analysis agent.
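For Python sources, that chunking step is almost free with the standard-library `ast` module; this is a sketch only, and a C codebase like FreeBSD's would need a real parser (e.g. tree-sitter) instead:

```python
import ast

def extract_functions(source: str) -> dict[str, str]:
    """Split one Python source file into {function_name: source_text}
    chunks, ready to feed one at a time to an analysis agent."""
    tree = ast.parse(source)
    return {
        node.name: ast.get_source_segment(source, node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
```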
