>* For years, despite functional evidence and scientific hints accumulating, certain AI researchers continued to claim LLMs were stochastic parrots: probabilistic machines that would: 1. NOT have any representation about the meaning of the prompt. 2. NOT have any representation about what they were going to say. In 2025 finally almost everybody stopped saying so.
Man, Antirez and I walk in very different circles! I still feel like LLMs fall over as soon as you give them an 'unusual' or 'rare' task that isn't likely to be represented in the training data.
"In 2025 finally almost everybody stopped saying so."
I haven't.
I don’t think this is quite true.
I’ve seen them do fine on tasks that are clearly not in the training data; what they struggle with is when some particular type of task, solution, or approach is something they haven’t been exposed to, rather than the exact task itself.
In the context of the paragraph you quoted, that’s an important distinction.
It seems quite clear to me that they are getting at the meaning of the prompt and are able, at least somewhat, to generalise and connect aspects of their training to “plan” and output a meaningful response.
This certainly doesn’t seem all that deep (at times it’s frustratingly shallow), and I can see how at first glance it might look like everything was just regurgitated training data. But my repeated experience (especially over the last ~6-9 months) is that there’s something more than that happening, which feels like what Antirez was getting at.
LLMs certainly struggle with tasks that require knowledge that is not provided to them (at sufficient volume/variance for them to retain it). But this is to be expected of any intelligent agent; it is certainly true of humans. It is not a good argument to support the claim that they are Chinese Rooms (unthinking imitators). Indeed, the whole point of the Chinese Room thought experiment was to consider whether that distinction even matters.
When it comes to doing novel tasks on known knowledge, they seem to be quite good. One also needs to consider that problem-solving patterns are themselves a kind of (meta-)knowledge that needs to be taught, either through imitation/memorisation (Supervised Learning) or through practice (Reinforcement Learning). They can be logically derived from other techniques to an extent, just as new knowledge can be derived from known knowledge in general, and again LLMs seem to be pretty decent at this, but only to a point. Regardless, all of this is definitely true of humans too.