rectang • today at 4:36 PM

I once thought the same about all the copyrighted works on which LLMs are currently trained. Surely they can't just hoover everything up? Haha, silly me.

I understand that creating an LLM itself is transformative, but an LLM trained on copyrighted works remains capable of generating derivative works, which eventually will result in successful copyright lawsuits against LLM users who redistribute those derivative works.

In advance of that day, the great race is to build a licensed corpus as aggressively as possible (see GitHub's latest decision to opt in Copilot usage). Even if Blender doesn't send your data on every save, other collection routes could be developed, such as publishing to a Blender-controlled public channel.
