Whilst I'm generally not against piracy - I think there are valid reasons why it's a thing - I can't help but feel that disguising piracy under the veil of fair use is a bit far-fetched and disingenuous.
Using copyrighted material in a way that genuinely qualifies as fair use seems fine to me, and it's important. But these companies not wanting to pay for creating their model and then just claiming fair use is silly.
> these companies not wanting to pay for creating their model and then just claiming fair use
Certainly model developers would prefer not to pay if given the option, but I also think it's fair to say that licensing content at the scale required hasn't actually been feasible.
Even just for training an object detection network as a side project, I struggled to find sufficient pre-training material outside of web-scraped datasets like ImageNet. I even contacted Getty and was told directly that they don't license images for machine learning.
Something like a compulsory licensing scheme, where you pay into a pot to train a model, could potentially work. Mostly, I hope whatever we eventually get is feasible for open source groups, individual developers, universities, smaller companies, etc., rather than being designed with only the few biggest companies in mind.