To my understanding, if the material is publicly available or obtained legally (i.e., not pirated), then training a model with it falls under fair use.
Once training is established as fair use, it doesn't really matter if the license is MIT, GPL, or a proprietary one.
> To my understanding, if the material is publicly available or obtained legally (i.e., not pirated), then training a model with it falls under fair use.
Is this legally settled?
That is exactly the point I am trying to make. Fair use is a copyright law issue, not a contractual one. If the GPL is a contract, then you are in breach of contract regardless of fair use or its equivalents.
Fair use only applies in the United States (and Poland, and a very limited set of other countries):
https://en.wikipedia.org/wiki/Fair_use#/media/File:Fair_use_...
It is certainly not part of the Berne Convention.
In almost every country in the world, even time-shifting with your VCR and ripping your own CDs is copyright infringement.