logoalt Hacker News

probably_wrongtoday at 3:01 PM0 repliesview on HN

I don't think your example works because in the book case there's a clear author whose ideas are being reproduced without permission. The LLM in your example is not the author but rather the printing press, and no one would argue that the printing press' ideas are being stolen because the press doesn't have any.

If one want to argue that "not citing the LLM would be plagiarism" then we would have to find the human at the end of the chain whose ideas are being reproduced, which would require LLMs to output "this idea was seen in the following training documents".