It does make sense, though it’s controversial. Human memory stores things in the same way, so what Nvidia does here is no different: the AI doesn’t actually copy any of the books. To call training illegal is similar to calling reading a book and remembering it illegal.
Our copyright laws are nowhere near detailed enough to settle this, so there is indeed a logical and technical inconsistency here.
I can definitely see these laws evolving to be human-centric: something is permissible for a human to do but not for an AI.
What is consistent is that obtaining the books was probably illegal. But if, say, Nvidia bought one Kindle copy of each book from Amazon and scraped everything for training, that would fall into the grey zone.
> To call training illegal is similar to calling reading a book and remembering it illegal.
A type of wishful thinking fallacy.
In law, scale matters. It's legal for you to possess a single joint. It's not legal to possess 400 tons of weed in a warehouse.
You can only read the book if you purchased it. Even if you don't intend to reproduce it, you must purchase it. So, I guess NVDA should just purchase all those books, no?
But to train the models, they have to download the books first (i.e., make a copy).
But it’s not just about recall and reproduction. If they used Anna’s Archive the books were obtained and copied without a license, before they were fed in as training data.
> To call training illegal is similar to calling reading a book and remembering it illegal.
Perhaps, but reproducing the book from this memory could very well be illegal.
And these models are all about production.