> LLMs don’t “learn” from the information they operate on, contrary to what a lot of people assume.
Nothing is really preventing this, though. AI companies have already proven they will ignore copyright and any other legal nuisance in order to train their models.
How would they distinguish between real and fake data? It would be far too easy to pollute their models with nonsense.
I mean, is it really ignoring copyright when copyright doesn't limit them in any way when it comes to training?
> Nothing is really preventing this though
The enterprise user agreement is preventing this.
Suggesting that AI companies will uniquely ignore the law or contracts is conspiracy theory thinking.
They're already using synthetic data generated by LLMs to further train LLMs. Of course they won't hesitate to feed in "anonymized" data from user interactions as well. Who's going to stop them, or even prove it's happening? These companies have already been allowed to violate copyright and privacy on a historic, global scale.