Hacker News

_heimdall 12/09/2024

I think the underlying question there, and one I don't have a solid answer for, is whether ChatGPT is considered to be scraping the underlying recipe or the webpage itself and all the content that goes along with it. The recipe may be centuries old, but the page, text, images, etc. are all content created and owned by the site creator.

Edit: for a better example - Brothers Grimm stories aren't protected, but if someone makes a movie based on those stories, the movie is absolutely protected.


Replies

josephg 12/09/2024

I think the real question is this: Is ChatGPT "just copying" the content in its training set? What constitutes plagiarism, exactly?

If ChatGPT is reproducing content verbatim from its training set, then I think the claim that it's violating copyright holds a lot of water. (And I think there was a NYT lawsuit claiming such - and I wish them well).

But if ChatGPT learns from 100 recipes for bechamel sauce, and synthesizes them into its own, totally original description, then I don't see how what it's doing is any different from what the authors of those recipe books & websites are doing. If anything, it's probably synthesizing a lot more sources than any recipe author. If the only common factor between ChatGPT's output and any specific source is the (public domain) recipe itself, then that seems ethically in the clear to me.

I can't see a justification to criminalise what ChatGPT is doing with recipes without casting so wide a net as to open recipe authors up to prosecution in the same way.

Scraping a website isn't illegal. When humans do it, we call it browsing the web.
