tredre3 • yesterday at 7:44 PM

What he's saying is that you should read the "Caveats and limitations" section of the article.

Here's the first one:

> Our tests gave models the vulnerable function directly, often with contextual hints (e.g., "consider wraparound behavior").

Mythos did no such thing; it was cut loose and told to find vulnerabilities. If the intent was to prove that small models are just as good, they haven't demonstrated that at all. The end.

Replies

cyanydeez • yesterday at 11:23 PM

OK, but you're missing the obvious: I could also give it the vulnerable function by just looping over all functions and providing a small hint about what to look at.

Until "Mythos" is compared with a small model running in the most bland and straightforward harness, there's no great context god that can't be emulated with deterministic scanning and context pulls.
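
To make the "loop over all functions" idea concrete, here is a rough sketch of what such a harness could look like. It is only an illustration under assumptions: it scans Python source with the standard `ast` module as a stand-in for whatever language the real targets are in, the names `iter_function_prompts` and `GENERIC_HINT` are made up for this example, and it only builds the prompts. Sending them to a model is left out.

```python
import ast

# Hypothetical generic hint; a real harness would pick its own wording.
GENERIC_HINT = "Check integer arithmetic, bounds checks, and wraparound behavior."


def iter_function_prompts(source: str, hint: str = GENERIC_HINT):
    """Yield (function_name, prompt) for every function defined in `source`.

    This is the deterministic scanning step: no model decides which
    function to look at, every function gets the same treatment.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Pull only the text of this one function as context.
            body = ast.get_source_segment(source, node)
            prompt = (
                "Review this function for vulnerabilities.\n"
                f"Hint: {hint}\n\n{body}"
            )
            yield node.name, prompt


if __name__ == "__main__":
    sample = "def a(x):\n    return x + 1\n\ndef b(y):\n    return y * 2\n"
    for name, prompt in iter_function_prompts(sample):
        print(f"--- {name} ---\n{prompt}\n")
```

Whether a small model fed prompts like these matches a model that was cut loose on a whole codebase is exactly the open question; the sketch only shows that the scanning and hinting part is cheap to automate.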
