As opposed to an irregular person?
LLMs are not persons, not even legal ones (and corporate legal personhood is itself a massive hack, causing serious issues such as the use of corporate finances for political gain).
A human has moral value; a text model does not. A human is limited in both the time and memory available; a text model is not. I don't see why comparisons to humans have any relevance here. Just because a human can do something does not mean machines run by corporations should be able to do it en masse.
The rules of copyright allow humans to do certain things because:
- Learning enriches the human.
- Once a human consumes information, he can't willingly forget it.
- It is impossible to prove how much a human-created intellectual work is based on the works of others.
With LLMs:
- Training (let's not anthropomorphize: lossily compressing input data by detecting and extracting patterns) enriches only the corporation that owns the model.
- It's perfectly possible to build a model only from content under specific licenses, or only from the public domain (see the first sketch after this list).
- It's possible to trace every single output byte to quantifiable influences from every single input byte (see the second sketch after this list). It's just not an interesting line of inquiry for the corporations benefiting from the legal gray area.
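On the licensing point, here's a minimal sketch of license-gated corpus filtering. The record shape, field names, and the allow-list are assumptions for illustration; a real pipeline would also have to verify that the declared metadata is accurate:

```python
# Minimal sketch: keep only training documents whose license metadata is
# on an allow-list. Field names and license identifiers are hypothetical.
ALLOWED_LICENSES = {"public-domain", "CC0-1.0", "MIT"}

def filter_corpus(records):
    """Yield the text of records whose declared license is allowed."""
    for rec in records:
        if rec.get("license") in ALLOWED_LICENSES:
            yield rec["text"]

corpus = [
    {"text": "On the Origin of Species ...", "license": "public-domain"},
    {"text": "A recent blog post ...", "license": "all-rights-reserved"},
]
print(list(filter_corpus(corpus)))  # only the public-domain text survives
```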
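On the tracing point, training-data attribution is an active research area rather than settled practice, but the basic machinery exists. TracIn (Pruthi et al., 2020), for instance, estimates the influence of one training example on one test output as a scaled dot product of their loss gradients. A toy sketch, with a linear model standing in for an LLM (real attribution sums this over checkpoints saved during training):

```python
import torch

def flat_grad(loss, params):
    """Flatten the gradients of `loss` w.r.t. `params` into one vector."""
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])

def influence(model, loss_fn, train_xy, test_xy, lr=1e-3):
    """TracIn-style first-order influence of one training example on one
    test loss: lr * <grad L(train), grad L(test)>. A positive value means
    the training example pushed the model toward this test output."""
    params = [p for p in model.parameters() if p.requires_grad]
    g_train = flat_grad(loss_fn(model(train_xy[0]), train_xy[1]), params)
    g_test = flat_grad(loss_fn(model(test_xy[0]), test_xy[1]), params)
    return lr * torch.dot(g_train, g_test).item()

# Toy usage: a linear regression standing in for a language model.
torch.manual_seed(0)
model = torch.nn.Linear(4, 1)
loss_fn = torch.nn.MSELoss()
train_xy = (torch.randn(1, 4), torch.randn(1, 1))
test_xy = (torch.randn(1, 4), torch.randn(1, 1))
print(influence(model, loss_fn, train_xy, test_xy))
```

Doing this at LLM scale is computationally expensive, but that's an engineering cost, not an impossibility.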
Dude, come on, I clearly wasn't saying LLMs are people. My point was that it's a tool, and it's the responsibility of the person wielding it to check the outputs.
If it's too hard to check outputs, don't use the tool.
As for your arguments that copyright works differently for LLMs: at the moment that's still being defined legally, so for now it's an ethical concern rather than a legal one.
For what it's worth, I agree that training LLMs on copyrighted material is an abuse of current, human-oriented copyright law. There's no way this will just be allowed to continue. Megacorps aren't going to lie down if there's a piece of the pie on the table, and then there's precedent for everyone else (a class action, perhaps).