Song lyrics. Not illegal. I can google them and see them directly on Google. LLMs refuse.
>Not illegal
Reproducing a copyrighted work 1:1 without a license is infringing. Other sites on the internet have to license the lyrics before showing them to a user.
It actually works the same as on Google. As in, ChatGPT will happily give you a link to a site with the lyrics without issue (regardless of whether the third-party site has any rights to them or not). But in the search/chat itself, you can only see snippets or small sections, not the entire text.
Related, GPT refuses to identify screenshots from movies or TV series.
Not for any particular reason, it flat out refuses. I asked it whether it could describe the picture for me in as much detail as possible, and it said it could do that. I asked it whether it could identify a movie or TV series from a description of a particular scene, and it said it could do that too, but that if I ever asked it to do both, it wouldn't, because that would be circumventing its guidelines! No, it doesn't quite make sense, but to me it seems quite indicative of a hard-coded limitation/refusal, because it is clearly able to do the subtasks. I don't think identifying scenes from a movie or TV show is illegal or even immoral, but I can imagine why they would hard-code this refusal: maybe because answering would make it easier to show it was trained on copyrighted material?
While the issue is far from settled, OpenAI recently lost a case in a German court regarding their use of lyrics for training:
https://news.ycombinator.com/item?id=45886131