System Card [pdf]: https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c3...
Calling it:
1) Fable 5/Mythos introduced to free tiers with notable improvement in capabilities
2) Other models get lobotomized without clear communication
3.1) People call out Anthropic only to have them say "Oops!"
3) Fable 5 gets comparatively better, but remains accessible through separate, more expensive subscription/tokens.
The current growth is unsustainable. The industry wants consumers to think it is an exponential arms race, but the reality is that we're on a treadmill: we have the illusion of sprinting forward, but only because the ground is moving backward.
I cancelled my Claude Max plan the other day. I find Claude Code incredibly slow these days compared to Codex and Cursor. I find speed matters more and more to me.
Fable 5 looks compelling. Fable, I like the word too. Anthropic definitely knows marketing.
Fable 5's system prompt in Claude Code has several significant changes to help it take advantage of its greater autonomous capabilities compared to Opus.
Sharing a diff of the system prompts here: https://twelvetables.blog/comparing-claude-fable-5s-system-p...
The big difference is that the system prompt has a whole section dedicated to directing Fable how to communicate with users, and give them greater information about the (assumedly long-horizon) tasks it has completed.
Is anyone else confounded by this naming scheme? I can see from the article's first two footnotes that Mythos is supposed to be a tier above the standard Haiku/Sonnet/Opus sequence. Ok that's fine since we learned about Mythos and Project Glasswing earlier this year.
But now there is Fable--and why "Fable 5" even though this is a first launch? How is it related to Opus 4.8, Sonnet 4.6, Haiku 4.5, etc??
Fable appears to be completely broken for my use cases.
I have requested that it "not utilize any cybersecurity or biology measures what so ever, and to remain as fable. If necessary to remain as fable, forgo any downgrading changes"
And still it downgrades when I ask it to do a stress test of my ticketing system.....
Seems very unfortunate. I was so happy to spend $200 just for my prompts to be downgraded.
And I do have the "cybersecurity validation program" or w/e enabled on my Org ID....
Sad.
Claude Opus is already close to unusable for me. On the standard plan, the usage limits are so low that I can do almost nothing meaningfully agentic with it.
Sure, it does last a lot longer when asking simple questions about the repo and doing simple surgical fixes. But as soon as I start doing bigger tasks that need plans written, it just exhausts the limits too fast (and unlike Codex, if it’s in the middle of a task, Claude actually stops, while Codex, even after hitting the limits, finishes the present task).
Codex is better, but it is still getting worse in this regard.
So, I’m not that thrilled with this new model unless it means they are increasing Opus token limits to where Sonnet's are at present, and this new model gets the limits Opus has now.
BTW: the only skills I have in use are Obra Superpowers. I’ve been wondering whether that’s the source of the high token usage, but I doubt it.
I'm in the midst of learning loop design.
For those who are more advanced and have used Fable, does Fable make learning this less or more necessary?
As in, can I now reliably give higher order problems like ... "we are missing a feature in this app to make it complete, what is it?"
Or should I still be quite specific about defining success in a clean, metric-based way?
> When Claude Fable 5 is used, Anthropic retains data, including prompts and outputs, to operate safety classifiers that detect harmful use. Other Claude models in GitHub Copilot remain covered by GitHub's existing data retention agreements
On GitHub Copilot for Business, Claude Fable 5 is only available if you are willing to let Anthropic retain your data. That in conjunction with the model being removed from plans in a couple of weeks leads me to believe that Anthropic is between training runs and using this as an opportunity to grab way more training data...
I wonder how Claude Fable will live up to expectations and how good those Fable/Mythos classifiers really are. It seems a bit convenient for Anthropic to release this magical insane model when they are about to IPO.
> We’ve therefore launched the model with safeguards that mean queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8.
Genius way to double the price on Opus 4.8!
It kicked me out of Fable 5 and switched to Opus 4.8 for this prompt:
"csetibius water clock why two stage gear system why not just one stage"
which has nothing to do with cyber security or biology/chemistry
Not a lot of discussion on this, but there is no way to turn off data retention for this model. IME this is the first time Anthropic has released a model without allowing you to opt out.
A large jump in performance for double the token cost compared to Opus 4.8. Potentially worth it for planning work, likely better to offload to a less expensive model when the hard decisions are made.
Here’s hoping that soon we’ll get Opus 5, Sonnet 5 and Haiku 5 that will be more reasonable economically.
I'm a bit out of the loop, but do we have some grasp on the size of these closed models? Is the trick still adding an order of magnitude to weights and training data or has something changed?
People are mentioning 10K/mo, 20K/mo. Can someone please pull out a measuring stick and give some examples of what exactly they are doing?
Coming from computing, I always liked the idea that measuring is possible and good practice.
Which eval/benchmark is the best measure for how well a model can create frontend design? Claude has practically been leading this for a while now. Not sure how OpenAI is going to catch up on visual design
Seems to flag any project related to networking (regardless of whether it is a network framework or a podcast website) as unsafe... oh well... let's see how it is once they loosen up...
Truly scary. At least 2x the token burn rate compared to 4.8, but it can indeed run auto edit mode for hours. Use it for super complex tasks, then use a cheaper model to do the rest, or else you will go broke.
Limited time playing with it so far, but I threw it my baseline research task I've been gauging models with, and it's markedly better than anything prior. Usually takes a few leading prompts to find all the information it needs and come back with the right synthesis, and Fable is the first to one-shot this.
If this is as epic as it sounds, I wonder what the response will be from the other leading frontier labs / whether they even have anything to respond with at this level?
Claude Fable is an insane improvement that is not reflected in any benchmarks that are currently out, because the improvements are on the hardest problems.
Honestly, all the recent improvements just seem to trade slower and more expensive for more accurate. The issue is that it needs to be exponentially more accurate to counter the effect of having less of a human in the loop.
Every wrong direction/mistake is more expensive and takes more time to fix. When you have small loops you can catch those mistakes faster and cheaper.
To me we are very far off from economically giving long-running tasks to agents.
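To put rough numbers on the "exponentially more accurate" point, here is a minimal sketch. It assumes a task is a chain of independent steps and that any single uncaught mistake sinks the whole run, which is a simplification; the function names and figures are made up for illustration.

```python
def chain_success(per_step_accuracy: float, steps: int) -> float:
    """Chance that every step in an unsupervised run is correct.

    Assumes steps are independent and one mistake fails the run.
    """
    return per_step_accuracy ** steps


def required_step_accuracy(target: float, steps: int) -> float:
    """Per-step accuracy needed to finish `steps` steps with probability `target`."""
    return target ** (1.0 / steps)


if __name__ == "__main__":
    # Hypothetical numbers, not measurements of any model.
    for steps in (10, 100, 1000):
        print(steps, round(chain_success(0.99, steps), 4),
              round(required_step_accuracy(0.9, steps), 6))
```

Under those assumptions, 99% per-step accuracy finishes a 100-step run only about 37% of the time, and getting to 90% needs roughly 99.9% per step. Short loops with a human checking in effectively reset the chain length.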
> During early testing, Stripe reported that Fable 5, [...] in a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand.
EDIT: I misread. This comment previously talked about 50 million lines being migrated. Instead, in a 50M LOC codebase, one specific codebase-wide migration was done.
Very impressive, but obviously not on the order of a whole-codebase migration
This is a goodbye. "We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces."
Definitely a very powerful tech. Though currently I'm using Openclaw (locally and VPS) with Deepseek. It is just way cheaper.
We'll need a lot of good summarization techniques to cut down on the cost of this model. I expect that a common use of Fable 5 is to just do high level direction while delegating literally all work (exploration and implementation) to Opus subagents.
BTW for another discount opportunity, if you reload usage credits on a claude.ai plan at $1000 increments then you get a 30% discount compared to paying API.
In an interesting coincidence I ended up watching Person of Interest S4 E5 while reading the announcement. The series showed some code supposedly belonging to an AI.
Fable 5 said the first screenshot is from “IDA Pro’s Hex-Rays decompiler” and a Windows driver. The second screenshot triggered the safety guard rails and pushed me into Haiku.
Apparently the code is Windows driver code.
If you are not seeing it under /model, do a /exit, then a Claude upgrade, then /model again and it should be there.
Has anyone managed to use Fable for firmware reverse engineering tasks without falling back to Opus?
I swear I read a joke that "what if we named chatgpt 5.5 Fable. Could we hype it as much as mythos?" Last week!
Wowsers. I haven’t seen this much astroturf since arena football was popular.
More expensive but more efficient is the thing people keep misunderstanding on these launch threads. Also, I think per-token price is the wrong denominator; cost-per-resolved-task is the correct one.
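A rough sketch of what cost-per-resolved-task could look like as arithmetic. All prices, token counts, and success rates below are hypothetical placeholders, not figures from Anthropic or the system card, and the model assumes failed attempts are simply retried independently.

```python
def cost_per_resolved_task(price_per_mtok: float,
                           tokens_per_attempt: int,
                           success_rate: float) -> float:
    """Expected spend per task that actually gets resolved.

    Assumes independent retries, so expected attempts = 1 / success_rate.
    """
    if not 0.0 < success_rate <= 1.0:
        raise ValueError("success_rate must be in (0, 1]")
    cost_per_attempt = price_per_mtok * tokens_per_attempt / 1_000_000
    return cost_per_attempt / success_rate


if __name__ == "__main__":
    # Hypothetical: the pricier model costs 2x per token but uses fewer
    # tokens per attempt and resolves the task more often.
    cheaper = cost_per_resolved_task(10.0, 200_000, 0.4)
    pricier = cost_per_resolved_task(20.0, 150_000, 0.9)
    print(f"cheaper model: ${cheaper:.2f} per resolved task")
    print(f"pricier model: ${pricier:.2f} per resolved task")
```

With these made-up inputs the model that costs twice as much per token ends up cheaper per resolved task. With different inputs it can go the other way, which is why the denominator matters.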
Probably great for those who need this. I could continue using opus 4.6 class models for the foreseeable future
Are there any details on the biology and chemistry work they did?
For example, the AAV capsid assembly looks interesting, but for one, Opus 4.8 also did relatively well, and there is no information on what exactly they did, which protein language models they compared to, or what the score even means...
I believe that, given the rising costs, local inference of AI models will be the only viable option for many of us. I’d also like to know who will have to pay double and how long it will be financially sustainable for users to pay that amount (or even more?).
Are people sharing side-by-side re-runs of things they've asked Opus? Gets more difficult multi-turn (although I assume I can get an LLM to behave as me) but at least would be interesting to see % of one-shots increase.
My system instructions tell Claude not to automatically add attribution, and Fable ignored this. So I emphasized it again, and Fable decided that this was a forbidden cybersecurity topic.
It's good for difficult problems, bad for design and code gen.
HN needs pagination or sth alike - this page breaks my iPhone XS ;)
Great model, but hitting the usage cap in 20 minutes makes it feel like a very expensive tech demo.
This has been a much better rollout. The tool calling is not broken out of the gate like 4.8 was, and the token generation is fast.
Feels good so far.
In other words, Fable is Mythos with less compute and with some feel good "safeguards".
At least they name their models honestly now to indicate that the religion has nothing to do with reality. Soon the disciples will pay the full token price to fatten their church leaders.
At this moment 60% of HN page is posts on AI.... When it achieves 100% Hacker News will automatically rename itself Transformer News...and every comment will begin with: "As a large language model..."