sigmoid10 • today at 7:13 AM

If you look at public benchmarks like ExploitBench [1], you'll see this is mostly a question of token budget. Given enough tokens to burn, GPT 5.5 is roughly as good as Mythos at finding bugs and building exploits. With some clever auto-prompting to clear stalls, it even beats the base Mythos version. So Mythos' "magic" is not in the model, but in the harness and compute environment. That's probably also why they never released it: Anthropic already struggled heavily to make Opus available to the general public, so releasing Mythos publicly may well be technically impossible for them due to compute constraints.

[1] https://exploitbench.ai
