Hacker News

1718627440 last Saturday at 1:53 PM

You take the copyrighted work to the printer yourself. You don't upload data to an LLM first; it is already in the machine. If you had an LLM without training data (however that would work) and the user had to provide the data, then it would be OK.


Replies

CamperBob2 last Saturday at 3:57 PM

You don't "upload" data to an LLM, but that's already been explained multiple times, and evidently it didn't soak in.

LLMs extract semantic information from their training data and store it at extremely low precision in latent space. To the extent original works can be recovered from them, those works were nothing intrinsically special to begin with. At best such works simply milk our existing culture by recapitulating ancient archetypes, a la Harry Potter or Star Wars.

If the copyright cartels choose to fight AI, the copyright cartels will and must lose. This isn't Napster Part 2: Electric Boogaloo. There is too much at stake this time.