That's an interesting hypothesis: that LLMs are fundamentally unable to produce original code.
Do you have papers to back this up? That was also my reaction when I saw some eerily accurate comments in a vibe-coded piece of code, but I couldn't prove it, and thinking about it now I believe my intuition was wrong (i.e., LLMs do produce original, complex code).
The whole "reproduces training data verbatim" argument is a red herring.
It reproduces _patterns from the training data_, sometimes including verbatim phrases.
The work (to discover those patterns, to figure out what works and what does not, to debug some obscure heisenbug and write a blog post about it, ...) was done by humans. Those humans should be compensated for their work, not the owners of mega-corporations who found a loophole in copyright.
Pick up a programming book from the seventies or eighties that was unlikely to have been scanned and fed into an LLM. Take a task from it, one that even a student could solve within 10 minutes, and ask an LLM to write a program for it. If the problem was never really published before, the LLM fails spectacularly.
> that LLMs are fundamentally unable to produce original code.
What about humans? Are humans capable of producing completely original code or ideas or thoughts?
As the saying goes, if you want to create something from scratch, you have to start by inventing the universe.
The human mind works by noticing patterns and applying them in different contexts.
I have a very anecdotal, but interesting, counterexample.
I recently asked Gemini 3 Pro to create an RSS feed reader type of experience by using XSLT to style and lay out an OPML file. I specifically wanted it to use a server-side proxy for CORS, pass caching headers through the proxy to leverage standard HTTP caching, and I needed all feed entries for every feed in the OPML to be combined into a single chronological feed.
It initially told me multiple times that it wasn't possible (it also reminded me that Google is getting rid of XSLT). Regardless, after I reiterated multiple times that it is possible, it finally decided to make a temporary POC. That POC worked on the first try, with only one follow-up to standardize date formatting with support for both Atom and RSS.
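For context, the proxy piece is conceptually along these lines. This is my own illustrative sketch of the architecture I asked for, not the generated code; the endpoint shape (`?url=...`), port, and class name are all made up:

```python
# Sketch of a CORS proxy that passes upstream caching headers through,
# so the XSLT-rendered page can fetch cross-origin feeds and the browser's
# normal HTTP caching still works. Stdlib only; not production code.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.parse import parse_qs, urlparse
from urllib.request import Request, urlopen

# Upstream headers forwarded to the browser so standard caching applies.
PASSTHROUGH = ("ETag", "Last-Modified", "Cache-Control", "Expires", "Content-Type")

class FeedProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical endpoint shape: GET /?url=<feed url>
        target = parse_qs(urlparse(self.path).query).get("url", [None])[0]
        if not target:
            self.send_error(400, "expected ?url=<feed url>")
            return
        req = Request(target, headers={"User-Agent": "feed-proxy-sketch"})
        # Forward the browser's conditional headers so upstream
        # 304 Not Modified responses survive end to end.
        for h in ("If-None-Match", "If-Modified-Since"):
            if self.headers.get(h):
                req.add_header(h, self.headers[h])
        try:
            resp = urlopen(req, timeout=10)
        except HTTPError as e:
            resp = e  # an HTTPError is itself a readable response (e.g. 304)
        except OSError as e:  # DNS failure, refused connection, timeout...
            self.send_error(502, f"upstream fetch failed: {e}")
            return
        self.send_response(resp.getcode())
        for h in PASSTHROUGH:  # pass caching headers through untouched
            if resp.headers.get(h):
                self.send_header(h, resp.headers[h])
        # The CORS part: allow the feed-reader page to read the response.
        self.send_header("Access-Control-Allow-Origin", "*")
        self.end_headers()
        self.wfile.write(resp.read())

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), FeedProxy).serve_forever()
```

The follow-up work was on the client side: normalizing RSS `pubDate` versus Atom `updated` timestamps so everything could be merged into one chronological list.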
I obviously can't say the code was novel, though I would be a bit surprised if it had trained on that task enough to remember roughly the full implementation and yet still claimed it was impossible.
I think the burden of proof is on the people making the original claim (that LLMs are indeed spitting out original code).
No, the thing needing proof is the novel idea: that LLMs can produce original code.
We can approach that question intuitively: if human input is not what drives the output, then it should be sufficient to train a model on a fraction of the current inputs, say everything up to 1970, and have it generate the post-1970 data as output.
If that does not work, then the moment you introduce AI you cap its capabilities unless humans continue to create original works to feed it. The conclusion - to me, at least - is that these pieces of software regurgitate their inputs and are effectively whitewashing plagiarism, or, alternatively, that their ability to generate new content is capped at some limit relative to the inputs.