We don't know how aware OpenAI was of the problems (or of the likelihood that they'd occur), or how deliberately they pushed through anyway.
If they knew and pushed through regardless, they certainly bear responsibility for what happened.
What if OpenAI knew responses like this were likely, but also knew preventing them would degrade overall model quality?
I'm being selfish here! I am confident that no AI model will convince me to harm myself, and I don't want the models I use to be hamstrung.