After Anthropic wholesale pirated millions of books, and got only a slap on the wrist and no jail time, and Meta did almost the same, I've decided that "Anna's" plus used physical books plus printed new books are the right combination.
It's kinda poetic that in the entire process, authors get screwed thrice. First, by the publishers and retailers, who keep 80% of the revenue. Then, by the hacker culture that enables widespread, institutionalized book piracy for the sake of "information wanting to be free". And finally, by the same hacker culture gone corporate, where "grown-up" geeks conclude that, since we already have all these pirated works out there, what's the harm in training LLMs on them?
Music is quite similar, and I've actually seen piracy justified by saying that "eh, the musicians are screwed either way". And of course, that piracy enabled suno.ai, which is now making sure that the musicians are really screwed.
> After Anthropic wholesale pirated millions of books
It's so much worse: they've literally destroyed real physical books in the hope that it would help them "work around" copyright, which we "regular" citizens need to comply with: https://arstechnica.com/ai/2025/06/anthropic-destroyed-milli...
I guess it depends on your definition of "worse". The process of buying books and destroying them was considered "transformative" enough to be legal, while Anthropic later did piracy and kind of legally undermined the whole book scanning operation.