Hacker News

tpmoney today at 8:07 PM

If you download GPL source code, run `wc` on its files, and distribute the output, is that a violation of copyright and the GPL? What if you do that for every GPL program on GitHub? What if you use Python and NumPy to generate a list of every word or symbol used in those programs and how frequently each appears? What if you generate the same frequency data, but also add a weighting by what the previous symbol or word was? What if you did that and also added a weighting by what the next symbol or word was? How many statistical analyses of the code files do you need to bundle together before it becomes copyright infringement?
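To make the escalation concrete, here is a rough sketch of the first few steps: plain token frequencies, then frequencies conditioned on the previous token. The tokenizer regex, the function names, and the sample string are all made up for illustration (nothing here comes from a real GPL project), and it uses only the standard library rather than NumPy.

```python
import re
from collections import Counter, defaultdict

def tokenize(source):
    # Hypothetical tokenizer: runs of word characters, or any single
    # non-space symbol. Real code would need a proper lexer.
    return re.findall(r"\w+|[^\w\s]", source)

def unigram_counts(tokens):
    # Step 1: how often each word or symbol appears.
    return Counter(tokens)

def bigram_counts(tokens):
    # Step 2: the same counts, split by what the previous token was.
    # Maps previous token -> Counter of the tokens that followed it.
    table = defaultdict(Counter)
    for prev, cur in zip(tokens, tokens[1:]):
        table[prev][cur] += 1
    return table

# Invented sample input, standing in for a source file.
sample = "int x = 0; int y = x;"
tokens = tokenize(sample)
freq = unigram_counts(tokens)
follows = bigram_counts(tokens)

print(freq.most_common(3))
print(dict(follows["int"]))
```

Each step keeps a little more of the original ordering than the one before it, which is the point of the question: the tables get closer to the text without any single step looking like a copy.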


Replies

sfink today at 9:15 PM

The line is somewhere between running wc on the entire input and running gzip on the entire input.

The fact that a slippery slope is slippery doesn't make it not a slope.
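A small sketch of why those two endpoints differ, assuming a hypothetical `wc`-like helper and two invented input strings: the counts are the same for different inputs, so they cannot be turned back into the text, while gzip output decompresses to the exact original bytes.

```python
import gzip

def wc(data):
    # Rough stand-in for `wc`: (newlines, whitespace-separated words, bytes).
    return (data.count(b"\n"), len(data.split()), len(data))

# Two invented inputs that differ in content but not in their counts.
text = b"int main(void) { return 0; }\n"
other = b"foo main(void) { return 1; }\n"

summary = wc(text)

# gzip is lossless: decompressing gives back the input byte for byte.
packed = gzip.compress(text)
restored = gzip.decompress(packed)

print(summary, wc(other) == summary, restored == text)
```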