Hacker News

userbinator today at 4:29 AM

> Full book content and model generations are not included because the books are copyrighted and the generations contain large portions of verbatim text.

There are plenty of old books in the public domain already... but I'm not sure what exactly this exercise is supposed to show, since the Kolmogorov limit still stands in the way of "infinite compression".


Replies

namenotrequired today at 5:09 AM

> There are plenty of old books in the public domain already

Yes, but showing that it happens with public-domain books does nothing to prove that it happens with copyrighted books.
