
rpdillon · today at 1:02 AM

My personal take is that LLMs are so transformative that they are likely not going to qualify under derivative works and therefore GPL wouldn't hold sway. There's already some evidence that courts will consider training on copyrighted material fair use, so long as it is otherwise obtained legally, which would be the case with software licensed under GPL.

I realize this is an unpopular opinion on HN, but I believe it is for the best, because it's a weaker interpretation of copyright law, which is overall a good thing in my view.


Replies

hparadiz · today at 2:34 AM

You can train models locally now and use open-source ones, and there's a robust community of people training, retraining, and generally pulling data from anywhere. Then new models get trained on old models. The models in use now are already several generations deep, further trained on code freely given by the entire industry. It's like complaining about being 1/100000th of a soup with no real proof you're even in it. Can you prove that a model used your code? It's a remix of a remix of a remix.

apatheticonion · today at 1:16 AM

I wonder if that would extend to training LLMs on decompiled firmware. A new clean room method unlocked?