empath75 • yesterday at 3:52 PM

They do memorize some books. You can test this trivially by asking ChatGPT to produce the first chapter of something in the public domain -- for example, A Tale of Two Cities. The output may not be word-for-word exact, but it'll be very close.
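
If you want a number for "very close" rather than eyeballing it, here is a rough sketch using Python's standard difflib. The function names and the sample `candidate` string are made up for illustration; in practice you would paste in whatever the model actually returned. It compares word sequences after stripping case and punctuation, which is only one of several reasonable ways to measure this.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def closeness(reference: str, candidate: str) -> float:
    """Return a 0..1 similarity ratio between the two word sequences."""
    matcher = difflib.SequenceMatcher(
        None, normalize(reference), normalize(candidate), autojunk=False
    )
    return matcher.ratio()


# Opening words of A Tale of Two Cities (public domain).
reference = (
    "It was the best of times, it was the worst of times, "
    "it was the age of wisdom, it was the age of foolishness"
)
# Hypothetical model output, invented here for illustration only.
candidate = (
    "It was the best of times, it was the worst of times; "
    "it was the age of wisdom, it was the age of folly"
)

print(f"closeness: {closeness(reference, candidate):.3f}")
```

A ratio near 1.0 means near-verbatim reproduction; unrelated text scores close to 0. For a whole chapter you would probably want to compare in chunks, since a single ratio hides where the output drifts.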

These academics were able to get multiple LLMs to produce large amounts of text from Harry Potter:

https://arxiv.org/abs/2601.02671

Replies

threethirtytwo • yesterday at 3:56 PM

In that case I would say it is the act of reproducing the books that is illegal. Training the AI on said books is not.

So the illegality rests at the point of output and not at the point of input.

I’m just speaking in terms of the technical interpretation of what’s in place. My personal views on what it should be are another topic.
