
observationist · yesterday at 5:19 PM · 0 replies

Respectfully, you're not completely wrong, but you are making some mistaken assumptions about the operation of LLMs.

Transformers learn a complex manifold representation of the causal phenomena present in the data they're trained on. When trained on a vast corpus of human-generated text, they come to model many of the underlying phenomena that produced that text.

In some cases, shortcuts, hacks, and entirely inhuman features and functions are learned. In other cases, features and functions are learned to an astonishingly superhuman level. Some things have a depth of recursion and complexity that exceeds what modern architectures can model, and there are subtle things that don't get picked up at all. LLMs do not have a coherent self or a subjective central perspective, even within the constraints of context modifications for run-time constructs. They're fundamentally many-minded or no-minded, depending on how they're used, and without that subjective anchor they lack the organizing principle needed to model a self across the long-horizon, complex features that human brains basically live in.

Confabulation isn't unique to LLMs. Everything you're saying about how LLMs operate can be said about human brains, too. Our intelligence and capabilities don't emerge from nothing, and human cognition isn't magical. And what humans do can also be considered "intelligent autocomplete" at a functional level.
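To make the "intelligent autocomplete" framing concrete, here's a deliberately crude toy: a bigram model that predicts the next word purely from counts of what followed each word in a corpus. This is my own illustrative sketch, nothing like a real LLM, which conditions on long contexts through learned representations rather than raw counts.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for each token, which tokens followed it in the corpus."""
    counts = defaultdict(Counter)
    tokens = corpus.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequent continuation, or None if unseen."""
    following = counts.get(token)
    return following.most_common(1)[0][0] if following else None

model = train_bigrams("the cat sat on the mat the cat ran")
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once -> "cat"
```

The point of the toy is that even pure statistics over past sequences yields behavior that looks like prediction; the interesting question is what gets modeled internally to make those predictions good.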

What cortical columns do is next-activation prediction at an optimally sparse, embarrassingly parallel scale - it isn't tokens being predicted but "which neuron or column does the brain expect to fire next." Where the prediction succeeds, synapses are reinforced; where it fails, signals are suppressed.
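That reinforce/suppress loop can be caricatured in a few lines. This is a hedged, non-biological sketch of my own invention (the class and parameter names are made up for illustration), not a model of cortex: each unit keeps weights over which unit it expects to fire next, strengthens a connection when the prediction is right, and suppresses it when wrong.

```python
class NextActivationPredictor:
    """Toy predictor: which unit fires after the current one?"""

    def __init__(self, n_units, lr=0.1):
        # w[i][j]: strength of the prediction "unit j fires after unit i"
        self.w = [[1.0] * n_units for _ in range(n_units)]
        self.lr = lr

    def predict(self, current):
        row = self.w[current]
        return max(range(len(row)), key=row.__getitem__)

    def observe(self, current, actual):
        guess = self.predict(current)
        if guess == actual:
            self.w[current][guess] += self.lr        # reinforce correct link
        else:
            self.w[current][guess] *= 1 - self.lr    # suppress failed link
            self.w[current][actual] += self.lr       # strengthen observed one

p = NextActivationPredictor(3)
for _ in range(20):
    p.observe(0, 1)          # unit 1 reliably follows unit 0
print(p.predict(0))          # after training, the model expects unit 1
```

The real thing differs in almost every detail - sparsity, timing, inhibition, scale - but the learning signal ("did my prediction of the next activation hold?") is the shared idea.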

Neocortical processing does the work of learning, modeling, and predicting across a wide multimodal, arbitrary-depth, long-horizon domain that allows us to learn words, writing, language, coding, rationalism, and everything else we do. We're profoundly more data-efficient learners, and massively parallel, amazingly sparse processing lets us pick up on subtle nuance and on amazingly wide and deep contextual cues in ways that LLMs are structurally incapable of, for now.

You use the word hallucinations as a pejorative, but everything you do, your every memory, experience, thought, plan, all of your existence is a hallucination. You are, at a deep and fundamental level, a construct built by your brain, from the processing of millions of electrochemical signals, bundled together, parsed, compressed, interpreted, and finally joined together in the wonderfully diverse and rich and deep fabric of your subjective experience.

LLMs don't have that, or at best have only disparate flashes of incoherent subjective experience, because nothing is persisted or temporally coherent at the levels that matter. That persistence could well be an important mechanism, and crucial to overcoming many of the flaws in current models.

That said, you don't want to get rid of hallucinations. You want the hallucinations to be valid. You want them to correspond to reality as closely as possible, coupled tightly to correctly modeled features of things that are real.

LLMs have created, at superhuman speed, vast troves of things that humans have not. They've even done things that most humans could not. I don't think they've done things that any human could not, yet, but the jagged frontier of capabilities is pushing many domains close to the level of competence at which they'll be superhuman in quality, outperforming any possible human on certain tasks.

There are architecture issues that don't look like they can be resolved with scaling alone. That doesn't mean shortcuts, hacks, and useful capabilities won't produce good results in the meantime, and if they can get us to the point of useful, replicable, and automated AI research and recursive self improvement, then we don't necessarily need to change course. LLMs will eventually be used to find the next big breakthrough architecture, and we can enjoy these wonderful, downright magical tools in the meantime.

And of course, human experts in the loop are a must, and everything must be held to a high standard of evidence and review. The more important the problem being worked on, like a law case, the more scrutiny and human intervention will be required. Judges, lawyers, and politicians are all using AI for things that they probably shouldn't, but that's a human failure mode. It doesn't imply that the tools aren't useful, nor that they can't be used skillfully.