Hacker News

cogman10 today at 12:29 AM

If there were going to be a case, it would be over derivative works. [1]

What makes it all tricky for the courts is that there's no good way to identify which work a given piece of generated code is a derivative of (except perhaps in some extreme examples).

[1] https://en.wikipedia.org/wiki/Derivative_work


Replies

felipeeria today at 3:57 AM

One could, in principle, carefully estimate how much a given document in the training set influenced the LLM's weights involved in a particular response.

However, that number would typically be vanishingly small, making it hard to argue that the whole model is a derivative of that one individual document.
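For what it's worth, the research literature does have attribution methods along these lines (influence functions, TracIn-style estimates). A minimal sketch of the core idea on a toy linear model — scoring a training example by the dot product of its loss gradient with the query's loss gradient. Everything here (the model, the data, the numbers) is purely illustrative, not how any real LLM is audited:

```python
# Toy sketch of training-data attribution via gradient dot products
# (the TracIn idea). A training example that pushes the weights in the
# same direction the query's loss would push them gets a high score.
import numpy as np

def grad_loss(w, x, y):
    # Gradient of squared error 0.5 * (w.x - y)^2 with respect to w.
    return (w @ x - y) * x

def influence(w, train_example, query_example):
    # Dot product of the two loss gradients at the current weights.
    g_train = grad_loss(w, *train_example)
    g_query = grad_loss(w, *query_example)
    return float(g_train @ g_query)

w = np.array([0.5, -0.2])
train = [
    (np.array([1.0, 0.0]), 1.0),  # same direction as the query
    (np.array([0.0, 1.0]), 0.0),  # orthogonal to the query
]
query = (np.array([2.0, 0.0]), 2.0)

scores = [influence(w, t, query) for t in train]
# The first example's gradient aligns with the query's, so its score
# dominates; the orthogonal example scores (near) zero.
```

In a real model you would sum such scores over checkpoints and over every token the document contributed, which is exactly why any single document's share comes out so small.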

Nevertheless, a similar approach might work if you took a FOSS project as a whole, e.g. "the model knows a lot about the Linux kernel because it has been trained on its source code".

However, it is still not clear that this would necessarily be unlawful, or that it would make the LLM's output a derivative work in all cases.

It seems to me that LLMs are trained on large FOSS projects as a way to teach them generalisable development skills, with the side effect of learning a lot about those particular projects.

So if I used an LLM to contribute to the kernel, it would clearly be drawing on information acquired during its training on the kernel's source code. Perhaps it could be argued that the output in that case would be a derivative work?

But if I used an LLM to write a completely unrelated piece of software, the kernel portion of the training set would contribute far less to the output.