Hacker News

jxjnskkzxxhx, last Wednesday at 7:54 AM

Do you have a reason to believe this ain't already being done? I would assume that the big guys like OpenAI are already training on basically all text in existence.


Replies

IlikeKitties, last Wednesday at 8:25 AM

In fact, Facebook torrented Anna's Archive and got busted for it, because of course they did:

https://torrentfreak.com/meta-torrented-over-81-tb-of-data-t...
