Hacker News

qarl, yesterday at 7:57 PM

I don't see your point. The problem is producing the copyrighted work, not processing it beforehand.

If it's illegal for AIs it should be illegal for humans, too. Is that really what you're arguing? It should be illegal for savants to read books?


Replies

SahAssar, yesterday at 8:07 PM

I don't think anyone is arguing that the consumption is illegal. It's the reproduction that is illegal.

Read a book, that's fine. Write a book, that's fine. Read a book, then write one that is 99.9% the same and sell it for profit without a license from the original author, and that's infringement.

Barrin92, yesterday at 10:06 PM

>The problem is producing the copyrighted work, not processing it beforehand.

The distinction isn't particularly clear-cut with an open-source model. If it can reproduce copyright-protected work with fidelity high enough that the output would count as derivative, that's like trying to get around laws against distributing protected works by handing them to you in a zip file.

It's a kind of copyright washing to hand you the data as a binary blob plus an algorithm to extract it. That wouldn't really fly with any other technology.

And that's really where a lot of the value is, mind you. These models are best thought of as lossily compressed versions of their input data. Otherwise Facebook ought to be perfectly fine training them on public-domain data.
