logoalt Hacker News

bluefirebrandyesterday at 11:07 PM2 repliesview on HN

> I've been looking for a copy-left "source available" license that allows me to distribute code openly but has a clause that says "if you would like to use these sources to train an LLM, please contact me and we'll work something out". I haven't yet found that

Frankly do you think AI companies have even the remotest amount of respect for these licenses anyways? They will simply take your code if it is publicly scrapeable, train their models, exactly like they have so far. Then it will be up to you to chase them down and try to sue or whatever. And good luck proving the license violation

I dunno. I just don't really believe that many tech companies these days are behaving even remotely ethically. I don't have much hope that will change anytime soon


Replies

apatheticonionyesterday at 11:35 PM

I wonder if there is a "loaded" lawsuit here that could be a win-win for license enforcement case law in LLMs.

Take a litigious company like Nintendo. If one was to train an LLM on their works and the LLM produces an emulator, that would force a lawsuit.

If Nintendo wins, then LLMs are stealing. If Nintendo loses, then we can decompile everything.

show 1 reply
archagontoday at 12:58 AM

Traditionally, large corporations have taken very conservative legal stances with regard to integrating e.g. A/GPL code, even when there's almost no risk.

If my license explicitly says "any LLM output trained on this code is legally tainted," I feel like BigAICorp would be foolish to ignore it. Maybe I couldn't sue them today, but are they confident this will remain the case 5, 10, 20 years from now? Everywhere in the world?