Hacker News

hn_acc1, yesterday at 8:29 PM

And then you let them train themselves and no one notices when they "accidentally" remove the guardrail prompts from the next version. And another 10 years later, almost no one remembers how "The Guardian" learns new things or how to stop it from being evil.