> something fundamental has changed that enables a computer to pretty effectively understand natural language.
You understand how the tech works, right? It's statistics and tokens. The computer understands nothing. Creating "understanding" would be a breakthrough.
Edit: I wasn't trying to be a jerk. I sincerely wasn't. I don't "understand" how LLMs "understand" anything. I'd be super pumped to learn that bit. I don't have an agenda.
As someone who was an engineer on the original Copilot team, yes, I understand how the tech works.
You don’t know how your own mind “understands” something. No one on the planet can even describe how human understanding works.
Yes, LLMs are vast statistical engines but that doesn’t mean something interesting isn’t going on.
At this point I’d argue that humans “hallucinate” and/or provide wrong answers far more often than SOTA LLMs.
I expect to see responses like yours on Reddit, not HN.
We could use a little more kindness in discussion. I think the commenter has a very solid understanding of how computers work. The “understanding” part is somewhat complex, but I do agree with you that we are not there yet. I do think, though, that the paradigm shift is more about the fact that we can now interact with the computer in a new way.
The end effect certainly gives off an "understanding" vibe, even if the method of achieving it is different. The commenter obviously didn't mean the way a human brain understands.
You understand how the brain works right? It's probability distributions mapped to sodium ion channels. The human understands nothing.
Birds and planes operate using somewhat different mechanics, but they do both achieve flight.
“You understand how the brain works right? It’s neurons and electrical charges. The brain understands nothing.”
I’m always struck by how confidently people assert stuff like this, as if the fact that we can easily comprehend the low-level structure somehow invalidates the reality of the higher-level structures. As if we know concretely that the human mind is something other than emergent complexity arising from simpler mechanics.
I’m not necessarily saying these machines are “thinking”. I wish I could say for sure that they’re not, but that would be dishonest: I feel like they aren’t thinking, but I have no evidence to back that up, and I haven’t seen non-self-referential evidence from anyone else.
"I don't "understand" how LLMs "understand" anything."
Why does the LLM need to understand anything? What today's chatbots have achieved is a software engineering feat. They have taken a stateless token-generation machine, one that has compressed the internet's text into a next-token predictor, and 'hacked' a whole state-management machinery around it. The end result is a product that feels like another human conversing with you and remembering your last birthday (a rough sketch of that wrapper loop is below).
Engineering will surely get better, and while purists can argue that a new research perspective is needed, the current growth trajectory of chatbots, agents, and code-generation tools will carry the torch forward for years to come.
If you ask me, this new AI winter will thaw in the atmosphere even before it settles on the ground.
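For anyone curious what "state management around a stateless model" looks like in practice, here's a minimal sketch. It assumes the OpenAI Python client; the model name and the "memories" list are illustrative placeholders, not anyone's actual product. Any chat-completion API works the same way, because all of the continuity comes from what the wrapper resends on each turn.

    # Minimal sketch of state management wrapped around a stateless token generator.
    # Assumes the OpenAI Python client; model name and "memories" are illustrative.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    history = [{"role": "system", "content": "You are a helpful assistant."}]
    memories = ["The user's birthday is March 3."]  # toy long-term memory store

    def chat(user_message: str) -> str:
        # The model itself keeps no state: continuity exists only because we
        # resend the accumulated history (plus saved "memories") on every call.
        history.append({"role": "user", "content": user_message})
        facts = {"role": "system", "content": "Known facts: " + "; ".join(memories)}
        response = client.chat.completions.create(
            model="gpt-4o-mini",            # assumed model name
            messages=[facts] + history,     # replay everything, every turn
        )
        reply = response.choices[0].message.content
        history.append({"role": "assistant", "content": reply})
        return reply

    print(chat("Any ideas for my birthday next month?"))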
Every time I see comments like these, I think about this research from Anthropic: https://www.anthropic.com/research/mapping-mind-language-mod...
LLMs activate similar neurons for similar concepts not only across languages, but also across input types. I’d like to know if you’d consider that a good representation of “understanding”, and if not, how would you define it?
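The Anthropic work looks at interpretable features inside Claude itself, which isn't something we can reproduce at home, but here's a much cruder public proxy for the cross-lingual part of the claim. Assuming the sentence-transformers package and this particular multilingual checkpoint, translations of the same sentence land close together in the model's representation space while an unrelated sentence does not:

    # Crude public proxy for the cross-lingual claim: translations of the same
    # sentence land near each other in a multilingual model's embedding space,
    # while an unrelated sentence does not. Assumes sentence-transformers and
    # this particular checkpoint are available.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

    sentences = {
        "en": "The cat is sleeping on the warm windowsill.",
        "fr": "Le chat dort sur le rebord de fenêtre chaud.",   # French translation
        "de": "Die Katze schläft auf der warmen Fensterbank.",  # German translation
        "unrelated": "Quarterly revenue grew by twelve percent.",
    }
    emb = dict(zip(sentences, model.encode(list(sentences.values()))))

    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    print("en vs fr:        ", cos(emb["en"], emb["fr"]))         # high
    print("en vs de:        ", cos(emb["en"], emb["de"]))         # high
    print("en vs unrelated: ", cos(emb["en"], emb["unrelated"]))  # much lower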
You don't understand how the tech works, then.
LLMs aren't as good as humans at understanding, but it's not just statistics. The stochastic parrot meme is wrong. The networks create symbolic representations in training, with huge multidimensional correlations between patterns in the data, whether temporal or semantic. The models "understand" concepts like emotions, text, physics, arbitrary social rules and phenomena, and anything else present in the data and context, in the same fundamental way that humans do. We're just better, with representations a few orders of magnitude higher in resolution, much wider redundancy, and multi-million-node parallelism with asynchronous operation that silicon can't quite match yet.
In some cases AI is superhuman and uses better constructs than humans are capable of; in other cases it uses hacks and shortcuts in its representations, mimics where it falls short, or fails entirely, and it has a suite of failure modes that aren't anywhere in the human taxonomy of operation.
LLMs and AI aren't identical to human cognition, but there's a hell of a lot of overlap, and the stochastic parrot "ItS jUsT sTaTiStIcS!11!!" meme should be regarded as an embarrassing opinion to hold.
"Thinking" models that cycle context and systems of problem solving also don't do it the same way humans think, but overlap in some of the important pieces of how we operate. We are many orders of magnitude beyond old ALICE bots and MEgaHAL markov chains - you'd need computers the size of solar systems to run a markov chain equivalent to the effective equivalent 40B LLM, let alone one of the frontier models, and those performance gains are objectively within the domain of "intelligence." We're pushing the theory and practice of AI and ML squarely into the domain of architectures and behaviors that qualify biological intelligence, and the state of the art models clearly demonstrate their capabilities accordingly.
For any definition of understanding you care to lay down, there's significant overlap between the way human brains do it and the way LLMs do it. LLMs are specifically designed to model constructs from data, and to model the systems that produce the data they're trained on, and the data they model comes from humans and human processes.
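To put rough numbers on the Markov-chain comparison above: an order-n Markov model needs a transition table with one row per possible n-token context, i.e. |V|^n rows, which explodes long before you reach anything like a modern context window. The vocabulary size and context lengths below are illustrative assumptions, not measurements of any particular model.

    # Back-of-the-envelope size of an order-n Markov transition table vs. a 40B LLM.
    # Vocabulary size and context lengths are illustrative assumptions.
    V = 50_000            # assumed token vocabulary, roughly GPT-scale
    llm_params = 40e9     # the 40B-parameter model mentioned above

    for n in (2, 4, 8, 16):
        rows = V ** n         # one row per possible n-token context
        cells = rows * V      # each row stores a probability for every next token
        print(f"order {n:>2}: {cells:.2e} table cells, "
              f"{cells / llm_params:.1e}x the LLM's parameter count")

Even at order 16, a context far shorter than today's models handle, the table already has more cells than there are atoms in the observable universe.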
It could very well be that statistics and tokens are how our brains work at the computational level too; our algorithms just have slightly better heuristics thanks to all those millennia of A/B testing of our ancestors.
I think it’s a disingenuous read to assume the original commenter means “understanding” in the literal sense. When we talk about LLM “understanding”, we usually mean it in a practical sense: if you give an input to the computer and it gives you the expected output, then colloquially the computer “understood” your input.
What do you mean by “understand”? Do you mean conscious?
“Understand” just means “parse language” and is highly subjective. If I talk to someone African in Chinese, they do not understand me, but they are still conscious.
If I talk to an LLM in Chinese, it will understand me, but that doesn’t mean it is conscious.
If I talk about physics to a kindergartner, they will not understand, but that doesn’t mean they don’t understand anything.
Do you see where I am going?
It astonishes me how people can make categorical judgements on things as hard to define as 'understanding'.
I would ask: apart from observable and testable performance, what else can you say about understanding?
It is a fact that LLMs are getting better at many tasks. From their performance, they seem to have an understanding of, say, Python.
The mechanistic way this understanding arises is different from the way it arises in humans.
How can you then say it is 'not real' without invoking the hard problem of consciousness, at which point we’ve hit a completely open question?