Hacker News

qarl yesterday at 8:11 PM

No, if you read the article, the point is in the training, not the reproduction.

That's what all these lawsuits are about - it's the training, not the reproduction. I already agreed in my first comment that the reproduction is off limits.

In this case, it appears that Meta torrented illegal copies of the work to do the training. Obviously that's bad. But conflating that with training itself doesn't follow.


Replies

SahAssar yesterday at 9:57 PM

The point of these lawsuits is the piracy. My parent comment was about the general situation, not this specific article.

Pirating content is illegal, regardless of whether it is done to train an LLM.

Usage of LLMs trained on unlicensed content (basically all of them) might or might not be illegal.

Using any method to reproduce a copyrighted work by using that original as input in a way that supplants the market value of the original is probably illegal.

At least that is my rudimentary understanding.

doublescoop yesterday at 8:39 PM

If copyright law doesn't extend to the works being used for training, why should it extend to the model that is produced as a result? AI model creators have set up an ethical scenario where the right thing to do is ignore copyright laws when it comes to AI, which includes model use. It might never be legal, but it has become ethical to pirate models, distill them against ToS, etc.

triceratops yesterday at 9:47 PM

Training requires making copies. Even if Meta had purchased each work, they'd have had to make copies of it to distribute around the training cluster.
