Hacker News

wrs today at 6:21 AM

I thought the point was not that Mythos finds more vulnerabilities, but that it can exploit them much more successfully. I thought the report showed it didn’t find much more than Opus 4.8. (Or did I misread?)


Replies

sigmoid10 today at 7:13 AM

If you look at public benchmarks like ExploitBench [1], then you'll see this is mostly a question of token budget. Once you give it sufficient tokens to burn, GPT 5.5 is roughly as good as Mythos when it comes to finding bugs and building exploits. With some clever auto-prompting to clear stalls, it even beats the base Mythos version. So Mythos' "magic" is not in the model, but in the harness and compute env. That's probably also why they never released it, because Anthropic already struggled heavily to make Opus available to the general public. Releasing Mythos publicly may well be technically impossible for them due to compute constraints.

[1] https://exploitbench.ai