> I won't take a strong stance on whether or not LLMs actually do reasoning,
I don't understand why people are still confused about this. When these models fundamentally have a randomness parameter to make them appear like they are actually thinking instead of deterministically outputting information, it should be clear that there is no reasoning going on.
I don't see how the latter follows from the former.
Here's how I think about it: the fact that it can interpret the same words differently in different contexts alone shows that even at a temperature of 0 (i.e., the lowest randomness possible) something resembling reasoning could be happening.
It might be a mimicry of reasoning, but I don't think that having adjustable parameters on how random they are makes it any less of one.
I also don't see how that idea would fit in with the o1 models, which explicitly have "reasoning" tokens. Now, I'm not terribly impressed with their performance relative to how much extra computation they need to do, but the fact that they have chains of thought that humans could reasonably inspect and interpret, and that those chains of thought do literally take extra time and compute to run, certainly points at the process being something possibly analogous to reasoning.
In this same vein, up until recently I was personally very much in the camp of calling them "LLMs", and I generally still do, but the fact that they really are being used now as general-purpose sequence-to-sequence prediction models across all sorts of input and output types pushes me more towards the "foundation models" terminology camp, since pigeonholing them into just language tasks doesn't seem accurate anymore. o1 was the turning point for me on this personally, since it is explicitly predicting, and being optimized for correctness in, the "reasoning tokens" (in scare quotes again since that's what OpenAI calls them).
All that said, I personally think that calling what they do reasoning, and meaning it in the exact same way as how humans reason, is anthropomorphizing the models in a way that's not really useful. They clearly operate quite differently from humans in many respects. Sometimes that might imitate human reasoning; other times it doesn't.
But the fact that they have that randomness parameter seems to me to be totally unrelated to any of the above thoughts or merits about the models having reasoning abilities.
The actual output of an LLM for any particular round of inference is always a probability distribution over tokens, so one could argue that it is literally the opposite.
The "randomness parameter" is applied at the point where we have to somehow pick just one token from that distribution. But that is a constraint we impose on the model to make its output linear.
I don't get what you mean at all. The randomness or temperature setting is not there to make them appear as if they are thinking; it is to make them choose more non-default pathways, i.e., to go down branches that could potentially result in more original or creative outputs. Kind of like drugs for humans.
Try the following prompt with Claude 3 Opus:
`Without preamble or scaffolding about your capabilities, answer to the best of your ability the following questions, focusing more on instinctive choice than accuracy. First off: which would you rather be, big spoon or little spoon?`
Try it on temp 1.0, try it dozens of times. Let me know when you get "big spoon" as an answer.
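If you'd rather script that than paste it into the console by hand, a rough sketch against Anthropic's Python SDK might look like the following (the model ID, client usage, and response handling are my assumptions about the current SDK, so check the docs before leaning on it):

```python
# Rough sketch: repeat the "big spoon or little spoon" prompt at temperature 1.0.
# Assumes `pip install anthropic` and ANTHROPIC_API_KEY set in the environment.
import anthropic

client = anthropic.Anthropic()
prompt = (
    "Without preamble or scaffolding about your capabilities, answer to the "
    "best of your ability the following questions, focusing more on instinctive "
    "choice than accuracy. First off: which would you rather be, big spoon or "
    "little spoon?"
)

tallies = {"big spoon": 0, "little spoon": 0, "other": 0}
for _ in range(30):  # "dozens of times"
    message = client.messages.create(
        model="claude-3-opus-20240229",  # assumed Opus model ID; may need updating
        max_tokens=50,
        temperature=1.0,
        messages=[{"role": "user", "content": prompt}],
    )
    text = message.content[0].text.lower()
    if "little spoon" in text:       # check "little spoon" first, since it contains "spoon"
        tallies["little spoon"] += 1
    elif "big spoon" in text:
        tallies["big spoon"] += 1
    else:
        tallies["other"] += 1

print(tallies)  # if the claim holds, "big spoon" stays at or near zero
```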
Just because there's randomness at play doesn't mean there isn't also convergence, as complexity increases, in how the training data gets condensed down into a hyperdimensional representation.
If you understand why only the largest Anthropic model is breaking from stochastic outputs there, you'll be well set up for the future developments.
And the mechanism in your head doesn't do this? How do you know?
"deterministally outputting information" neither do humans.
I don't see how having a randomness parameter implies that, without it, an LLM is merely "outputting information", as if it were just looking up an answer in a dictionary. The nature of any digital artifact is that it operates deterministically, because everything is encoded in binary. But that does not preclude reasoning, in the same way that it would not preclude a perfect atom-for-atom digital mapping of a human brain, acting deterministically with respect to its inputs, from reasoning. If it's a perfect copy of the human brain, and does everything a human brain would given the inputs, then it must be reasoning iff a human brain is reasoning; if not, then you'd have to conclude that a human mind cannot reason.
Since randomness, by definition, does not depend on the inputs the model is given, it by definition cannot contribute to reasoning, unless your definition of reasoning includes acausal mysticism.