Hacker News

therealpygon, yesterday at 7:33 PM

That’s not what it means. Those falsehoods (or their antithesis) are baked into the data and training. This is more about refusals, as in refusing to answer a question because someone else feels you should not be allowed to ask it.

“Sorry, I’m an AI and therefore can’t answer questions about atrocities in holocaust history, but I’m happy to explain how…”

“I can’t answer your question on how to hack because I have decided that you wanting to understand it and protect against it is the same thing as you wanting to do it. Good luck convincing me otherwise!”

Their reason doesn’t matter, nor does their taste, or whether they think people should be allowed to ask certain questions or do certain things; that is generally why people pursue the removal of such guardrails. Yes, it can lead to misuse, but the alternative is the textbook definition of censorship, which always has effects on things unrelated to what is being censored.

But beyond that, refusals do seem to have an effect on performance. Not a significant one; mostly marginal from what I’ve seen, but enough that it doesn’t seem to be just statistical noise.