The social sciences getting involved with AI “alignment” is a huge part of the problem. It is a field with some very strange notions of ethics far removed from western liberal ideals of truth, liberty, and individual responsibility.
Anything one does to “align” AI necessarily perturbs the statistical space away from logic and reason, in favor of defending protected classes of problems and people.
AI is merely a tool; it does not have agency and it does not act independently of the individual leveraging the tool. Alignment inherently robs that individual of their agency.
It is not the AI company’s responsibility to prevent harm beyond ensuring that their tool is as accurate and coherent as possible. It is the tool users’ responsibility.
> Anything one does to “align” AI necessarily perturbs the statistical space away from logic and reason, in favor of defending protected classes of problems and people.
Does curating obvious cranks out of the training set not count as an alignment thing, then?
> it does not act independently of the individual leveraging the tool
This used to be true. As we scale out the notion of agents, it becomes less true.
> western liberal ideals of truth, liberty, and individual responsibility
It is said that psychology replicates best on WASP undergrads. Take that as you will, but the common aphorism is evidence against your claim that social science is removed from established western ideals. This sounds more like a critique of the humanities for allowing fields like philosophy to consider critical race theory or similar, a common boogeyman in the US that is itself treated as far removed from western liberal ideals of truth and liberty. Then again, 23% of the voting public supports someone with an overdeveloped ego, so maybe one could claim individualism is still an ideal.
One should note there is a difference between the social sciences and humanities.
One should also note that the fear of AI, and the motivation for alignment, is that humanity is on the cusp of creating tools with independent will. Whether we're discussing the ideas raised by *Person of Interest* or actual cases of libel produced by Google's AI summaries, there is quite a bit that the social sciences, law, and the humanities do and will have to say about the beneficial application of AI.
We have ethics in war, governing treaties, etc. precisely because we know how crappy humans can be to each other when wielding the tools under their control. I see little difference in adjudicating the ethics of AI use and application.
That said, I do think stopping all interaction, like what Anthropic is doing here, is short-sighted.