Hacker News

threethirtytwo, yesterday at 3:05 PM, 6 replies

It does make sense, though it’s controversial. Human memory retains things in much the same way, so what Nvidia does here is no different: the AI doesn’t actually copy any of the books. Calling training illegal is similar to calling reading a book and remembering it illegal.

Our copyright laws are nowhere near detailed enough to settle this, so there is indeed a logical and technical inconsistency here.

I can definitely see these laws evolving to become human-centric: something may be permissible for a human to do but not for an AI.

What is consistent is that obtaining the books was probably illegal. But if Nvidia had bought one Kindle copy of each book from Amazon and scraped everything for training, that would fall into a grey zone.


Replies

ckastner, yesterday at 3:19 PM

> To call training illegal is similar to calling reading a book and remembering it illegal.

Perhaps, but reproducing the book from this memory could very well be illegal.

And these models are all about production.

lelanthran, yesterday at 3:21 PM

> To call training illegal is similar to calling reading a book and remembering it illegal.

A type of wishful thinking fallacy.

In law, scale matters. It's legal for you to possess a single joint. It's not legal to possess 400 tons of weed in a warehouse.

kalap_ur, yesterday at 3:51 PM

You can only read the book if you purchased it. Even if you don't intend to reproduce it, you must purchase it. So, I guess NVDA should just purchase all those books, no?

_trampeltier, yesterday at 4:56 PM

But to train the models they have to download the books first (that is, make a copy).

godelski, yesterday at 5:01 PM

You need to pay for the books before you memorize them.

Nursie, yesterday at 4:26 PM

But it’s not just about recall and reproduction. If they used Anna’s Archive, the books were obtained and copied without a license before they were fed in as training data.