Hacker News

zaptheimpaler, yesterday at 12:53 AM

Meta and Anthropic at least fed entire copyrighted books into training. Not the Wikipedia page, not a plot summary or some tropes; they fed the entire original book into training. They used at least the entirety of LibGen, which is a pirated dataset of books.