Supermancho • today at 1:55 PM

> I found my interactions with Fable to be extremely impressive; it made other models, including GPT 5.5 and Opus 4.8, feel small and dumb.

> Anthropic models have consistently been top-scoring in BullshitBench[0]

*eyeroll* I find that Anthropic models feel bigger and dumber.

I'm interested in code utility and correctness, even if the majority of AI use is not focused on that.

airstrike • today at 3:57 PM

I think this just proves anyone can pick a benchmark that supports their point, so maybe we shouldn't treat them as evidence at all.
