Essentially, LLMs are recontextualizing their training data. So on one hand, one might argue that training is like a human reading books and inference is like writing something novel, (partially) based on the reading experience. But the social contract among humans considers it plagiarism when we recite some studied text and then claim it as our own. That is why, for example, books attribute quotations with citations and footnotes.
With source code we used to either re-use a library as-is, in which case the license terms would apply, OR write our own implementation from scratch. While this LLM recontextualization purports to be like the latter, it is sometimes evident that the original license, or at least some attribution, comment, or footnote, should apply. If only to help with future legibility and maintenance.
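For concreteness, the kind of attribution comment I mean might look like the following sketch (the project name, URL, and author here are invented purely for illustration):

    # Adapted from the (hypothetical) textdist project,
    # https://github.com/example/textdist -- MIT License, (c) Jane Doe.
    # Changes: simplified to plain strings, no Unicode normalization.
    def edit_distance(a: str, b: str) -> int:
        """Classic dynamic-programming Levenshtein distance."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                 # deletion
                               cur[j - 1] + 1,              # insertion
                               prev[j - 1] + (ca != cb)))   # substitution
            prev = cur
        return prev[-1]

A three-line header like that costs nothing and records exactly the provenance that an LLM-generated rewrite currently discards.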
I think the attribution is a very good point!