Hacker News

0xy yesterday at 6:24 PM

What? Fable was designed to refuse to work on security issues, as Anthropic specifically confirmed. How is forcing Fable to work on things behind guardrails not breaking a guardrail?

This is Anthropic's own claim. They were very specific. Have you read their own claims?


Replies

InsideOutSanta yesterday at 6:40 PM

Yes, I have read their own claims. Here's the relevant part:

"When Fable’s classifiers detect a request related to cybersecurity, biology and chemistry, or distillation, the response is automatically handled by Claude Opus 4.8 instead. Users will be informed whenever this occurs."

Asking Fable to fix bugs in a code base is not "a request related to cybersecurity." When Fable was asked to fix bugs and then proceeded to fix bugs, that was not "removing guardrails". Fable did exactly what it should have done. Claiming otherwise makes absolutely no sense at all.
