Hacker News

SR2Z today at 1:27 AM

But then someone would figure out some prompts that don't trigger this, and Anthropic wouldn't be able to try to disadvantage competitors.


Replies

BoorishBears today at 2:43 AM

Except they openly reject many, many other classes of prompts, including extremely high-stakes CBRN ones.

It's only the one direction with direct potential business impact that they've decided to sabotage instead of reject.