So what? I can probably produce parts of the header from memory. Doesn't mean my brain is GPLed.
The question was "if I train my model with copyleft material, how do you prove I did?"
If your brain were distributed as software, I think it might be?
There is a stupid presupposition that LLMs are equivalent to human brains, which they clearly are not. Stateless token generators are OBVIOUSLY not like human brains, even if you somehow contort the definition of intelligence to include them.
> So what? I can probably produce parts of the header from memory. Doesn't mean my brain is GPLed.
Your brain is part of you. Some might say it is your very essence. You are human. Humans have inalienable rights that sometimes trump those enshrined by copyright. One such right is the right to remember things you've read. LLMs are not human, and thus don't enjoy such rights.
Moreover, your brain is not distributed to other people. It's more like a storage medium than a distribution. There is a lot less furore about LLMs that are merely storage media, where neither the models themselves nor their outputs are distributed. Such models are, of course, not very useful.
So your analogy is poor.
Not your brain, but the code you produce, if it includes portions of GPL code that you remembered.