> My intention is to highlight the fact that LLM conversations are cleverly disguised examples of...

Nevermark • yesterday at 11:37 PM • 11 replies

> My intention is to highlight the fact that LLM conversations are cleverly disguised examples of sentence continuation

Regardless of bigger issues, this kind of statement reveals a deep misunderstanding.

Problem type does not limit problem complexity. Nor does problem type limit solution complexity or power.

If a machine has to learn to understand humans to complete text, then that is what it has to do. And there is no theoretical or practical basis for suggesting that this is somehow "faking" understanding, just because of the form of original data streaming in and out.

Neither problem type nor input/output structure limits internal representations.

Understanding is learned from patterns in the data, not the gross form of the data. Does the data require an understanding of something to complete the task? Then that understanding will be what is optimized.

To the degree they are limited, it is for other reasons. Resources such as computing, parameter number, lack of representative data, ... Which in the cases of SOTA models, we know are not limits. A conclusion verified by the models' actual abilities.

Replies

lgessler • today at 12:52 AM

Raphaël Millière has a very useful term for this kind of vacuous dismissal, the redescription fallacy (https://arxiv.org/pdf/2401.03910, page 9):

> Recent debates have been clouded by a misleading inference pattern, which we term the “Redescription Fallacy.” This fallacy arises when critics argue that a system cannot model a particular cognitive capacity, simply because its operations can be explained in less abstract and more deflationary terms. In the present context, the fallacy manifests in claims that LLMs could not possibly be good models of some cognitive capacity because their operations merely consist in a collection of statistical calculations, or linear algebra operations, or next-token predictions. Such arguments are only valid if accompanied by evidence demonstrating that a system, defined in these terms, is inherently incapable of implementing [that capacity]. To illustrate, consider the flawed logic in asserting that a piano could not possibly produce harmony because it can be described as a collection of hammers striking strings, or (more pointedly) that brain activity could not possibly implement cognition because it can be described as a collection of neural firings. The critical question is not whether the operations of an LLM can be simplistically described in non-mental terms, but whether these operations, when appropriately organized, can implement the same processes or algorithms as the mind, when described at an appropriate level of computational abstraction.

Isamu • today at 2:16 AM

>If a machine has to learn to understand humans to complete text, then that is what it has to do.

A language model completes text based on the overlapping patterns of the training data.

There absolutely was thinking involved… in the training data. Same as when you read a book, you engage with the thinking behind the text. The book isn’t thinking, and the author may be dead and gone, but there’s absolutely the traces of thinking in the text.

Language models produce mashups of texts they were trained on, and there’s absolutely the traces of thoughts behind those mashups.

cauch • today at 12:10 AM

I think, for me, the thing is that when you do basic ML, you discover that ML will very often find data patterns that fit the goal but do not correspond to a real mechanism.

So, I think there is a flaw in the logic of saying that human text has a pattern of "consciousness mechanism" and therefore an LLM will learn a "consciousness mechanism" in order to return a convincing sentence continuation. There are probably tons of data patterns an LLM can learn from to reproduce a convincing sentence continuation without having to learn the specific mechanism that is "conscious".

For me, one element that shows this is the absence of a world model (or a "human-like" world model) despite the fact that the sentence continuation is convincing. If the only way to produce convincing sentence continuation were by "simulating a brain", that would not explain the first LLMs from several years ago (before the extra layers of RLHF, ...). They were able to hold quite convincing conversations on a lot of non-trivial topics, and yet failed on some aspects that should have been basic for a system trained to work like a human brain. It shows that it is possible to "cleverly disguise examples of sentence continuation" without having to build the elements one expects of a conscious being.

dogwalker5000 • today at 1:58 AM

> If a machine has to learn to understand humans to complete text, then that is what it has to do. And there is no theoretical or practical basis for suggesting that this is somehow "faking" understanding, just because of the form of original data streaming in and out.

I think the main complaint is that LLMs don’t arrive at the answer the way we do. They’re capable of emulating some of our behavior but not all of it, as the mechanism by which they work is very different.

Maybe I’m wrong about this, but one thing humans do that LLMs don’t is deductive reasoning. LLMs seem to operate entirely on inductive reasoning.

hn_acc1 • today at 12:40 AM

I would maybe agree with you if the entire realm of human existence was limited to words. There are many human experiences that transcend text, and indeed can hardly be adequately described using text.

Sure, it's the best we have online, but that does not make "the internet" the sum of all human experience. To reduce all of humanity down to the text on the internet is reducing us to the level of machines to fit the requirement of what a machine can process / simulate.

Lerc • today at 1:59 AM

>To the degree they are limited, it is for other reasons. Resources such as computing, parameter number, lack of representative data, ...

This is where the other claim is being made: that the structure of the model is fundamentally incapable of the operation, so even if you stipulated that the way you provide data is sufficient for intelligence, it still wouldn't work.

The universal approximation theorem addresses this point: with an identity attention mechanism, an LLM is just a multilayer perceptron. The attention mechanism is effectively a way to get one of the benefits of a much larger fully connected layer without the massive cost.

An LLM can do what an MLP can do. A large enough MLP can approximate any continuous function to arbitrary precision.

That makes the claim that an LLM could not do a task the same as saying no function can do that task.
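
A minimal sketch of the approximation claim, assuming the simplest case: one input, one hidden layer of ReLU units, and weights written down by hand rather than trained. The names `build_relu_mlp`, `mlp_forward`, and `max_error` are made up for illustration. It only shows that the error on `sin` shrinks as the hidden layer gets wider; the theorem says such weights exist, not that training will find them.

```python
import math


def relu(z):
    return max(0.0, z)


def build_relu_mlp(f, a, b, width):
    """Hand-construct a one-hidden-layer ReLU net with `width` hidden units.

    The net reproduces the piecewise-linear interpolant of f through
    width + 1 evenly spaced points on [a, b]. Returns (bias, knots, coeffs).
    """
    step = (b - a) / width
    knots = [a + i * step for i in range(width)]           # hidden-unit offsets
    values = [f(a + i * step) for i in range(width + 1)]   # targets at the knots
    slopes = [(values[i + 1] - values[i]) / step for i in range(width)]
    # Each hidden unit contributes the change in slope at its knot.
    coeffs = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, width)]
    return values[0], knots, coeffs


def mlp_forward(params, x):
    """Output = bias + sum_i coeff_i * relu(x - knot_i)."""
    bias, knots, coeffs = params
    return bias + sum(c * relu(x - k) for c, k in zip(coeffs, knots))


def max_error(f, a, b, width, samples=2000):
    """Largest absolute gap between f and the net on an even grid over [a, b]."""
    params = build_relu_mlp(f, a, b, width)
    grid = (a + (b - a) * j / samples for j in range(samples + 1))
    return max(abs(f(x) - mlp_forward(params, x)) for x in grid)


if __name__ == "__main__":
    for width in (4, 16, 64):
        err = max_error(math.sin, 0.0, 2 * math.pi, width)
        print(f"hidden units: {width:3d}  max error: {err:.5f}")
```

Widening the hidden layer is the only knob here, which is the "large enough MLP" part of the argument in miniature.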

Some are ok with this: if you invoke some supernatural aspect to intelligence, then the inability to describe it with a function is quite reasonable.

If you want to stay in the world of reality, you have a much harder task. People like to point at quantum effects (Penrose), but it's hard to say what it is you are pointing at.

I think the very act of proving that something is or is not intelligent would render it functional by nature of it having a proof (or disprove Gödel's incompleteness, a tough ask).

Are there any proofs that cannot be expressed as a function? A kind of Gödel locator, where you can prove something that you can identify is true but there is no formula to express it. I'm not entirely sure what that would even mean.

slashdave • today at 12:46 AM

> of the form of original data streaming in and out.

Except this is not consciousness.

krupan • today at 12:00 AM

"If a machine has to learn to understand humans to complete text, then that is what it has to do."

But the machine doesn't have to understand humans to do that. It gets trained on a whole bunch of sentences and then it is able to complete text. You could maybe claim that it "understands" the text but even that's a stretch.

qarl • yesterday at 11:54 PM

Yeah. There are good arguments against LLM consciousness. This is not one of them.

I'm hearing a lot of bad arguments against LLM consciousness lately. Bad argumentation heralds bad outcomes.

tsunamifury • today at 12:45 AM

Come on, I invented parts of this technology at Google and am baffled why this is debated.

We discovered math that decodes data storage in language and is able to use sophisticated continuation cohorts from ALL OF HUMAN RECORDED KNOWLEDGE to respond to you in a call/response model with very good synthesis capabilities.

It's super useful, but not life or consciousness. It's a simulated echo from our collective recorded behaviors. It understands because we understood first. It replies because we wrote it first. And it sorts, organizes, synthesizes and compresses that at impressive speed now.

calf • today at 12:34 AM

His intention is irrelevant, as is "trying to highlight a fact" as if it were the final say: all Chiang is doing here is using fancy white-collar words to make the same argument leveled against Hinton and others regarding next-token prediction. And his audience, who have even less technical understanding, lap it all up unawares. Chiang is a writer and needs to stay in his own lane, not RP as an expert; or, if he wants to do journalism on this topic, then he should actually do the work and talk to more actual experts, not just the ones cherry-picked for his opinion piece.
