
imiric · last Saturday at 11:35 PM

If by "body-slammed" you mean "trained on SO user data while violating the terms of the CC BY-SA license", then sure.

In the best case scenario, LLMs might give you the same content you were able to find on SO. In the common scenario, they'll hallucinate an answer and waste your time.

What should worry everyone is what system will come after LLMs. Data is being centralized and hoarded by giant corporations, and not shared publicly. And the data that is shared is generated by LLMs. We're poisoning the well of information with no fallback mechanism.


Replies

Dylan16807 · last Sunday at 12:28 AM

> If by "body-slammed" you mean "trained on SO user data while violating the terms of the CC BY-SA license", then sure.

You know that's not what they meant, but why bring up the license here? Even if they had been over-the-top compliant, attributing every SO answer under every chat and licensing the LLM output as CC BY-SA, I think we'd still have seen the same shift.

> In the best case scenario, LLMs might give you the same content you were able to find on SO. In the common scenario, they'll hallucinate an answer and waste your time.

Best case, it gives you the same level of content, but more customized and faster.

SO being wrong and wasting your time is also common.