ted_dunning, today at 5:28 PM

Are you saying that you have not observed these things in the world? I definitely have. The blog didn't do the work for you, but if we look at some of the claims I think it is pretty clear:

a) increased training scale would result in highly fluent systems that would fool users into trusting untrustworthy output.

Can you possibly be claiming that this is not a common experience? Do you really need references to the legal cases which had hallucinated legal theories and citations? Or the utter slop being passed off as research papers?

b) large-scale AI would amplify bias in the source material.

The large investments nearly every frontier model development team spends on this problem is probably good enough evidence. Grok is another point of evidence. The studies showing that AI systems imitate gender bias in evaluating resumes are another. The gender bias in guessing the names of people in sentences is another.

The blog actually mentions specific cases that exhibited all of these problems. They did not cite references for them, but you can use a search engine.

c) environmental costs

This is widely discussed and documented. Take xAI's use of polluting turbine generators for their Colossus 2 data center in Mississippi as just a single example. Do you really need a reference for the environmental impact of the proposed data center in Utah that (as planned) will consume more energy than the entire state currently does?

d) training set audits are impossible.

Do you need substantiation of the inappropriate imagery in training data? The blog gives you a pretty solid reference.

... and so on ...

I suppose it could be true that when you said "I don't see" you really meant "I didn't look at the blog". Is that why you can't see the substantiation?


Replies

staticman2, today at 7:06 PM

Thanks for the reply.

I'm a little confused about what is being claimed. The Tumblr article says:

"That healthcare triage tools would underperform on Black patients. That loan approval systems would entrench inequality while presenting their decisions as neutral algorithmic judgment."

Are we talking about language models? Was a lender using a language model?

The paper cited is about language models.

Apparently Stable Diffusion's training data contained some bad images. The paper's title is, again, about language models. (That Stable Diffusion claim is weird too. Someone warned us there's too much data to audit, then someone audited the data and removed the bad data, so the paper is correct?)

Grok is intentionally biased, so I don't think the bad generations are necessarily due to amplifying the training data.

And it's also not clear that manual auditing of training data would ensure anything is safe. Wouldn't models still have plenty of examples of bad behavior from the news?

On bias you wrote:

"The large investments nearly every frontier model development team spends on this problem is probably good enough evidence."

I thought the claim was that a bad thing we were warned about is happening.

Are you saying the fact that they invest in safety means the models are not safe?

Does that mean Anthropic and OpenAI can prove they are safe by firing all the safety researchers?

Also:

"Researchers studying low-resource languages have documented active degradation in translation quality, because the synthetic content fed back into training is itself worse in those languages."

Who knows what this is referring to? I'm not going to search for it, but I wouldn't be surprised if it's comically off point.