It's license washing. The code is great because it's already a problem solved by someone else. The AI can spit out the solution with no license and no attribution, and somehow it's legal. I hope American tech legislation keeps that same energy once others start taking American IP and spitting it back out with no license or attribution.
I've seen many discussions stating that patent hoarding has gone too far, and also that copyright for companies has gone way too far (so much so that Amazon can remove items from your purchase library if they lose their license to them).
Then AI begins to offer a way around this overly litigious system, and that becomes a core anti-AI argument.
I do think it's silly to think public code (as in, code published to the public) won't be re-used by someone in a way your license doesn't permit. If you didn't want that to happen, don't publish your code.
Having said that, I do think there's a legitimate concern here.
> The AI can spit out the solution with no license and no attribution, and somehow it's legal.
Has that been properly adjudicated? That's what the AI companies and their fans wish, but wishing for something doesn't make it true.
The other day I had an agent write a parser for a niche query language which I will not name. There are a few open source implementations of this language on GitHub, but none of them are in my target language and none of them use PEGs. The agent wrote a near-perfect implementation of this query language as a PEG. I know that it looked at the implementations on GitHub, because I told it to, yet the result is nothing like them. It just used them as a reference. Would this, and should this, be a licensing issue if they weren't MIT-licensed?
To me, it's just further evidence that trying to assert ownership over a specific sequence of 1s and 0s is an entirely futile and meaningless endeavor.
I did have the thought that the SCOTUS ruling against Oracle slightly opened the door to code not being copyrightable (they deliberately tap-danced around the issue). Maybe that's the future: all code is plumbing; no art, no creative intent.
> The AI can spit out the solution with no license and no attribution, and somehow it's legal.
Note that even MIT requires attribution.
If I include licensed code in a prompt and have an LLM include it in the output, is it still licensed?
At the end of the day it's up to the publisher of the work to attribute the sources that might end up in some commercial or public software derivative.
Do you give attribution to all the books, articles, etc. you've read?
Everything is a derivative work.
The models need to get burned down and retrained with these considerations baked in.
This is why it's astonishing to me that AI has passed any legal department. I regularly see AI output large chunks of code that are 100% plagiarised from a project; it's often not hard to find the original source just by looking up snippets of it. Hundreds of lines of code, just completely stolen.
AI doesn't actually wash licenses; it literally can't. Companies are just assuming they're above the law.