Thanks for the reply.
I'm a little confused about what is being claimed. The Tumblr article says:
"That healthcare triage tools would underperform on Black patients. That loan approval systems would entrench inequality while presenting their decisions as neutral algorithmic judgment."
Are we talking about language models? Was a lender using a language model?
The paper cited is about language models.
Apparently Stable Diffusion contained some bad images. The paper title, again, says language models. (That Stable Diffusion claim is weird too. Someone warned us there's too much data to audit, then someone audited the data and removed the bad data, so the paper is correct?)
Grok is intentionally biased, so I don't think the bad generations are due to amplifying the training data, necessarily.
And it's also not clear that manual auditing of training data would ensure anything is safe. Wouldn't models still have plenty of examples of bad behavior from the news?
On bias you wrote:
"The large investments nearly every frontier model development team spends on this problem is probably good enough evidence."
I thought the claim was that a bad thing we were warned about is happening.
You are saying the fact they invest in safety means the models are not safe?
Does that mean Anthropic and OpenAI can prove they are safe by firing all the safety researchers?
Also:
"Researchers studying low-resource languages have documented active degradation in translation quality, because the synthetic content fed back into training is itself worse in those languages."
Who knows what this is referring to? I'm not going to search for it, but I wouldn't be surprised if it's comedically off point.