Hacker News

extr, yesterday at 10:09 PM, 1 reply

I'm a big fan of Anthropic. Just check my post history. I've been accused of working there. But this is complete bullshit and they need to get real. Silent sandbagging is not acceptable, especially given that they've shown with this release that their safety filters produce a HUGE number of false positives.


Replies

zzleeper, yesterday at 10:43 PM

It's increasingly obvious that the only safeguard we've got is open models, and semi-open ones like those from China. Crazy world.