> On the broader point, I hear you, but I think there's a middle ground. Not all content is public knowledge. Some of it is premium, proprietary, or behind a paywall. The people publishing it should get to decide whether it becomes free training data.
I don't follow. Are you suggesting that someone is scraping private sites, ones you have to log in to, in order to train AI on their content?