so basically we download the sources files to the training weight and remove the LICENSE.MD as it's exactly the same as a person learning to program from proprietay secret code and outputing code based on that for millions of peoples in matter of seconds /s
we also treat as however we want public goods found over the internet. as the World Intellectual Property Organization Copyright Treaty and Berne Convention for the Protection of Literary and Artistic Works aren't real or because we can as we are operating in international waters, selling products for other sails living exclusively in international waters /s
If you download GPL source code and run `wc` on its files and distribute the output of that, is that a violation of copyright and the GPL? What if you do that for every GPL program on github? What if you use python and numpy and generate a list of every word or symbol used in those programs and how frequently they appear? What if you generate the same frequency data, but also add a weighting by what the previous symbol or word was? What if you did that an also added a weighting by what the next symbol or word was? How many statistical analyses of the code files do you need to bundle together before it becomes copyright infringement?