One of the things I always look at with new model releases is long-context performance, and based on the system card it seems like they've cracked it:
GraphWalks BFS 256K-1M:
  Mythos: 80.0%
  Opus:   38.7%
  GPT5.4: 21.4%

This seems to be similar to gpt-pro: they just have a very large attention window (which is why it's so expensive to run). The true attention window of most models is 8096 tokens.
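For context, GraphWalks-style tasks ask the model to trace a breadth-first search over a graph described in the prompt and report the visited nodes. A minimal sketch of the underlying traversal the benchmark checks against (the graph and names here are illustrative, not from the benchmark itself):

```python
from collections import deque

def bfs_order(graph: dict[str, list[str]], source: str) -> list[str]:
    """Return nodes in breadth-first order starting from `source`."""
    visited = {source}
    order = [source]
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order

# Illustrative graph; real GraphWalks prompts embed much larger graphs
# to stress the 256K-1M token context window.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(bfs_order(graph, "a"))  # ['a', 'b', 'c', 'd']
```

The difficulty at long context isn't the algorithm, which is trivial, but that every edge the model needs may sit hundreds of thousands of tokens apart in the prompt.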
Data source:
https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...
(Search for “graphwalk”.)
If true, the SWE-bench performance looks like a major upgrade.