0xy • yesterday at 6:24 PM

What? Fable was designed to refuse to work on security issues, as Anthropic specifically confirmed. How is forcing Fable to work on things behind guardrails not breaking a guardrail?

This is Anthropic's own claim. They were very specific. Have you read their own claims?

Replies

InsideOutSanta • yesterday at 6:40 PM

Yes, I have read their own claims. Here's the relevant part:

"When Fable’s classifiers detect a request related to cybersecurity, biology and chemistry, or distillation, the response is automatically handled by Claude Opus 4.8 instead. Users will be informed whenever this occurs."

Asking Fable to fix bugs in a code base is not "a request related to cybersecurity." When Fable was asked to fix bugs and then proceeded to fix bugs, that was not "removing guardrails". Fable did exactly what it should have done. Claiming otherwise makes absolutely no sense at all.
