Hacker News

yorwb · today at 7:29 AM

Everybody training models on large amounts of lightly filtered internet text is partially distilling every other model that had its output posted verbatim to the internet.