
mitthrowaway2, today at 4:24 PM

There's an argument by which machine-learned neural network weights are a lossy compression of (as well as a smooth interpolator over) the training set.

An mp3 file is also a machine-generated lossy compression of a cd-quality .wav file, but it's clearly copyrightable.

To that extent, the main difference between a neural network and an .mp3 is that the mp3 compression cannot be used to interpolate between two copyrighted works to output something in the middle. This is, on the other hand, perhaps the most common use case for genAI, and it's actually tricky to get it to not output something "in the middle" (but also not impossible).

I think the copyright argument could really go either way here.


Replies

littlestymaar, today at 4:50 PM

> An mp3 file is also a machine-generated lossy compression of a cd-quality .wav file, but it's clearly copyrightable.

Not the .mp3 itself, but the creative piece of art that it encodes.

You can't record Taylor Swift at a concert and claim copyright on that. Nor can you claim copyright on an mp3 re-encoding of old audio footage that belongs to the public domain.

Whether LLMs are in the first category (infringing the rights of the training data's copyright holders) or in the second (public domain or fair use) is an open question that case law is slowly resolving, jurisdiction by jurisdiction, but that doesn't address the question of the weights themselves.
