It is possible to check for improvements. See for yourself:
https://generative-ai.review/2026/06/claude-fable-rush-test-...
As mentioned in another HN thread, I've done qualitative side-by-side comparisons of Claude Fable vs Opus 4.8 vs ChatGPT 5.5.
Anyone is able to check the output for themselves and form a judgement.
There are large visible improvements for Fable over Opus 4.8 and ChatGPT 5.5.
I recently did the same to show the progress from Opus 3.4/ChatGPT o3pro one calendar year ago.
Sorry, this post gets me irrationally irritated and makes me want to shake you and shout.
That website is 95% not you, it's AI, and I feel that's causing you to way over-represent the value of it in your response here, or you're completely misunderstanding what the person you're responding to is asking. If you put all of your effort into that site, without AI, it would be infinitely more valuable and useful.
The person you responded to asked for specific things, including:
- objective, unbiased measurements, but all that page has is a side-by-side visual comparison of outputs.
- their different generations, but all you included were the outputs
- details on the prompts and little things people are adding because they feel they need to, but you didn't include any of that
This is slop. It's the exact sort of self-confirming, fluffy AI stuff that other engineers, either inexperienced or over-invested in AI, will look at briefly, skim, see quick visual validation, and nod, noting down how much better Fable must be without getting any actual data.
Sorry, it's early, and maybe this is a misplaced rant, but the person you responded to specifically asked for precise, quantitative things precisely because everything else is fluffy slop like this, and people don't even recognise they're doing it any more.