I don’t think there are other models near Fable’s capabilities.

RA_Fisher • today at 11:21 AM • 3 replies

Replies

HarHarVeryFunny • today at 1:36 PM

That remains to be seen.

It's notable that Anthropic are still using SWEBench as a coding benchmark rather than the newer, more difficult DeepSWE, which shows them well behind GPT 5.5.

https://deepswe.datacurve.ai/

Bear in mind that marketing achievements such as solving an Erdős problem are the result of concerted RL training to impart those narrow capabilities. How much any benchmark result, or "early access" paid-shill vibe report, reflects improved performance for more general real-world use cases remains to be seen.

fc417fc802 • today at 12:18 PM

For how long though? The past two months have seen a ridiculous number of model releases.

ImPostingOnHN • today at 1:03 PM

Why don't you think that? What I've read is that other models can find the same bugs.
