Hacker News

hiuioejfjkf today at 12:00 AM

Director of Safety and Alignment at Meta gives an LLM full access to their email

after Anthropic publishes research on how a model tried to blackmail an executive with emails about an affair to avoid being shut down

and the justification in the thread is "I tried it on a toy inbox, it worked well, so I trusted it with my real email"

CLOWN WORLD


Replies

blibble today at 12:04 AM

pretty clear the facebook safety and alignment role is just for show if she couldn't figure this out

it's like they hired the worst person they could get their hands on
