Hacker News

sbinnee, yesterday at 9:59 PM (0 replies)

I am puzzled by the frontier code graph. GPT 5.5 doesn’t show any improvement with higher reasoning effort. This new benchmark by Cognition seems to have been released alongside Fable 5’s announcement.

I am not trying to cook up a theory here, but the graph mostly shows how strong the Claude Opus family is. I am not saying that Opus is not powerful, but the result doesn’t align with my experience of GPT 5.5 and Opus 4.7.

I understand that Fable and Mythos are frontier models that can do protein folding better than task-specialized ones. To be honest, from a practical point of view, for day-to-day coding assistance, the GPT family looks more reasonable.

(But then my company pays for Claude Max anyway for token maxxing. So who am I to complain?)