It's sad how many people are falling for the narrative that there's more at play here than next-token prediction, and that some kind of emergent intelligence is happening.
No, that is just your interpretation: you see behavior that you feel can't possibly be just token prediction.
And yet it is. It's the same algorithm noodling over incredible amounts of tokens.
And that's exactly the explanation: people regularly underestimate how much training data goes into LLMs. The corpora contain everything about writing a compiler: toy examples, full implementations, recommended structure, yadda yadda yadda.
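For anyone unsure what "predict-next-token" means mechanically, here's a minimal sketch of a greedy decoding loop (gpt2 via Hugging Face transformers is just an illustrative stand-in; any causal LM generates the same way, one token at a time):

```python
# Minimal sketch of autoregressive next-token prediction.
# Model and prompt are illustrative, not a claim about any specific product.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tokenizer("To write a compiler, first", return_tensors="pt").input_ids
for _ in range(20):
    logits = model(ids).logits           # scores for every vocabulary token
    next_id = logits[0, -1].argmax()     # greedy: take the most likely next token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)  # append and repeat

print(tokenizer.decode(ids[0]))
```

Everything the model outputs, however impressive, comes out of this one loop.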
I love working with Claude, and it regularly surprises me, but that doesn't mean I think it is intelligent.