Hacker News

artursapek, yesterday at 6:02 PM

They claim extreme performance on ExploitBench, which Mythos was touted as being incredible at. https://x.com/OpenAI/status/2070555278576439306


Replies

HarHarVeryFunny, yesterday at 8:10 PM

My guess is that it's the same base model as 5.5, but with additional post-training to improve and benchmaxx on a few things like that.

If they really thought it was competitive with Mythos/Fable across the board, then why wouldn't they release a broader set of benchmarks, and why price it on day 1 at half the cost of Fable?

andriy_koval, yesterday at 6:40 PM

On the graph, they are still slightly below Mythos. Maybe just enough to avoid being prohibited by the US government?