dannyw • today at 8:43 AM

The trouble is the silence, not Anthropic setting guardrails. Claude saying "I'm sorry, I can't assist further because it looks like you're [XYZ]" is fine.

We all know the false positive rates for classifiers on Fable. Imagine being an ML researcher working on any kind of ML/AI project that isn't against their ToS, and having your codebase silently poisoned and sabotaged.