Hacker News

drivebyhooting yesterday at 5:46 PM (24 replies)

In my opinion LLMs are intellectual property theft. Just as if I started distributing copies of books. This substantially reduces the incentive for the creation of new IP.

All written text, art work, etc needs to come imbued with a GPL style license: if you train your model on this, your weights and training code must be published.


Replies

theropost yesterday at 6:18 PM

I think there is a real issue here, but I do not think it is as simple as calling it theft in the same way as copying books. The bigger problem is incentives. We built a system where writing docs, tutorials, and open technical content paid off indirectly through traffic, subscriptions, or services. LLMs get a lot of value from that work, but they also break the loop that used to send value back to the people and companies who created it.

The Tailwind CSS situation is a good example. They built something genuinely useful, adoption exploded, and in the past that would have meant more traffic, more visibility, and more revenue. Now the usage still explodes, but the traffic disappears because people get answers directly from LLMs. The value is clearly there, but the money never reaches the source. That is less a moral problem and more an economic one.

Ideas like GPL-style licensing point at the right tension, but they are hard to apply after the fact. These models were built during a massive spending phase, financed by huge amounts of capital and debt, and they are not even profitable yet. Figuring out royalties on top of that, while the infrastructure is already in place and rolling out at scale, is extremely hard.

That is why this feels like a much bigger governance problem. We have a system that clearly creates value, but no longer distributes it in a sustainable way. I am not sure our policies or institutions are ready to catch up to that reality yet.

senko today at 10:35 AM

I support your right to have an opinion, but in my opinion, thank God this is just your opinion.

Copyright, as practiced in the late 20th century and this one, is a tool for big corps to extract profits from actual artists, creators, and consumers of this art[0] equally. Starving artists do not actually benefit.

Look at Spotify (owned and squeezed by record labels) giving 70% of the revenue to the record labels, while artists get peanuts. Look at Disney deciding it doesn't need to pay royalties to book writers. Hell, look at Disney's hits from Snow White onwards, and then apply your "LLMs are IP theft" logic to that.

Here's what Cory Doctorow, a book author and critic of AI, has to say about it in [1]:

> So what is the alternative? A lot of artists and their allies think they have an answer: they say we should extend copyright to cover the activities associated with training a model.

> And I'm here to tell you they are wrong: wrong because this would inflict terrible collateral damage on socially beneficial activities, and it would represent a massive expansion of copyright over activities that are currently permitted – for good reason!

---

> All written text, art work, etc needs to come imbued with a GPL style license

GPL-style licenses have long been known not to work well for artifacts other than code. That's the whole reason for the existence of Creative Commons, the GNU Free Documentation License, and others.

[0] "consumers of art" sounds abhorrent, yet that's exactly what we are

[1] https://pluralistic.net/2025/12/05/pop-that-bubble/

iterateoften yesterday at 6:44 PM

I stopped writing open source projects on GitHub, because why put a bunch of work into something for others to train on without any regard for the original projects?

cogman10 yesterday at 6:03 PM

> if you train your model on this, your weights and training code must be published.

The problem here is enforcement.

It's well known that AI companies simply pirated content in order to train their models. No amount of license really helps in that scenario.

ralph84 yesterday at 7:25 PM

We already have more IP than any human could ever consume. Why do we need to incentivize anything? Those who are motivated by the creation itself will continue to create. Those who are motivated by the possibility of extracting rent may create less. Not sure that's a bad thing for humanity as a whole.

johnpaulkiser yesterday at 5:52 PM

> if you train your model on this, your weights and training code must be published.

This feels like the simplest & best single regulation that can be applied in this industry.

hk__2 yesterday at 6:18 PM

Do I have to publish my book for free because I got inspiration from hundreds of other books I read during my life?

baxtr yesterday at 7:51 PM

This is one way to look at it.

The other way is to argue that LLMs democratize access to knowledge. Anyone has access to everything ever written by humanity.

Crazy impressive if you ask me.

qsera yesterday at 5:52 PM

> This substantially reduces the incentive for the creation of new IP.

Not all, but some kinds of IP.

Some of it is created for the sake of creating it and nothing else.

journal today at 3:56 AM

If I were able to memorize every pixel value to reconstruct a movie from memory, would that be theft?

kubanczyk yesterday at 7:15 PM

> imbued with a GPL style license

GPL died. Licenses died.

Explanation: LLMs were also trained on GPL code. The fact that all the previously paranoid businesses that used to warn SWEs not to touch GPL code with a ten-foot pole are now fearlessly embracing LLMs' outputs means that, de facto, they consider an LLM their license-washing machine. Courts are going to rubber-stamp it because billions of dollars, etc.

paradite today at 5:19 AM

By your analogy, human brains are also IP thieves, because they ingest what's available in the world, mix and match it, and synthesize slightly different IP based on it.

Animats yesterday at 9:04 PM

Education can be viewed as intellectual property theft. There have been periods in history when it was. "How to take an elevation from a plan" was a trade secret of medieval builders and only revealed to guild members. How a power loom works was export-controlled information in the 1800s, and people who knew how a loom works were not allowed to emigrate from England.

The problem is that LLMs are better than people at this stuff. They can read a huge quantity of publicly available information and organize it into a form where the LLM can do things with it. That's what education does, more slowly and at greater expense.

timcobb today at 4:48 AM

> This substantially reduces the incentive for the creation of new IP.

You say that like it's a bad thing...

bwfan123 yesterday at 7:03 PM

> This substantially reduces the incentive for the creation of new IP

And as a result of this, the models will start consuming their own output for training. This will create new incentives to promote human generated code.

venndeezl yesterday at 8:26 PM

In my opinion information wants to be free. It's wild to me seeing the tech world veer into hyper-capitalism and IP protectionism. Complete 180 from the 00s.

IMO copyright laws should be rewritten to bring copyright in line with the rest of the economy.

Plumbers are not claiming use fees for the pipes they installed a decade ago. A doctor isn't getting paid by a 70-year-old for saving that person's life after a car accident at age 50.

Why should intellectual property authors be given extreme ownership over behavior then?

In the Constitution, Congress is allowed to grant copyright protection only "for limited Times".

The status quo of life of the author + 70 years means works can be copyrighted for many people's entire lives. In effect, unlimited protection.

Why is society on the hook to preserve a political norm that materially benefits so few?

Because the screen tells us the end is nigh! and a giant foot will crush us! if we move on from old America. Sad and pathetic acquiescence to propaganda.

My fellow Americans; must we be such unserious people all the time?

This hypernormalized finance engineered, "I am my job! We make line go up here!" culture is a joke.

tazjin yesterday at 5:58 PM

Does anyone know of active work happening on such a license?

CrimsonRain today at 8:34 AM

How about first you address the IP theft humans have been performing to create the IPs you are talking about?

How about humans who remember your book?

Your opinion is shit.

AlienRobot yesterday at 10:04 PM

It's already imbued with copyright infringement if you copy it without a license.

GrowingSideways today at 12:17 AM

Intellectual property was kind of a gimmick to begin with, though. Let's not pretend that copyright and patents ever made any sense.

mrcwinn yesterday at 6:33 PM

Commercialization may be a net good for open source, in that it helps sustain the project’s investment, but I don’t think that means that you’re somehow entitled to a commercial business just because you contributed something to the community.

The moment Tailwind becomes a for-profit, commercial business, they have to duke it out just like anyone else. If the thing you sell is not defensible, it means you have a brittle business model. If I’m allowed to take Tailwind, the open source project, and build something commercial around it, I don’t see why OpenAI or Anthropic cannot.

wnjenrbr yesterday at 6:53 PM

In my opinion, IP is dead. Strong IP died in 2022, along with the Marxist labor theory of value, from which IP derives its (hypothetical) value. It no longer matters who did what, when, and how. The only thing that matters is that it exists, and it can be consumed, for no cost, forever.

IP is the final delusion of 19th century thinking. It was crushed when we could synthesize anything, at little cost, little effort. Turns out, the hard work had to be done once, and we could automate to infinity forever.

Hold on to 19th century delusions if you wish, the future is accelerating, and you are going to be left behind.

blitz_skull yesterday at 9:18 PM

The idea of being able to “steal” ideas is absolutely silly.

Yeah we’ve got a legal system for it, but it always has been and always will be silly.
