What section of the ToS prohibits this? In other words, what is the thing that is being done that is against the ToS? Looking up the creator of a repo, or the contributors of the repo?
I did a quick scan of the ToS and all I could find was D8, which states that automated access (scraping) used for "AI" applies a reciprocal license that prevents the scraper from restricting GitHub's access to the data (the whole model? the weights?) resulting from the scraping.
This makes it sound like any model trained on GitHub content cannot be commercialized, because charging for access to the output would be a "technical or other limit"... So you're obviously not really enforcing this, otherwise MS would be suing every big commercial model out there!
It seems like a safe assumption that the companies behind the big commercial models will have negotiated their own private GitHub terms of service, especially considering their many-digit annual contracts with Azure.