> Because AI is not intelligent, it doesn't "know" what it previously output even a token ago.
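Of course it knows what it output a token ago. That's the whole point of attention and the whole basis of the quadratic curse: every new token attends over all the tokens before it, so the cost grows quadratically with context length.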
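To make that concrete, here is a toy sketch of single-head causal self-attention in plain Python. It is not how any particular model is implemented: the function name `causal_attention` and the hand-picked vectors are made up for illustration, and real models add learned projections, multiple heads, and a key/value cache. It only shows that each position reads every earlier position, and that the number of query-key comparisons grows as n(n+1)/2.

```python
import math

def causal_attention(queries, keys, values):
    """Toy single-head causal self-attention (hypothetical helper, pure Python)."""
    n = len(queries)
    d = len(queries[0])
    outputs = []
    score_count = 0  # how many query-key dot products were computed
    for i in range(n):
        # Position i can see positions 0..i: itself and everything before it.
        scores = []
        for j in range(i + 1):
            dot = sum(q * k for q, k in zip(queries[i], keys[j]))
            scores.append(dot / math.sqrt(d))
            score_count += 1
        # Numerically stable softmax over the visible positions.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Output is the attention-weighted sum of the visible value vectors.
        out = [
            sum(w * values[j][c] for j, w in enumerate(weights))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs, score_count

# Four made-up token vectors, used as queries, keys, and values alike.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
outputs, score_count = causal_attention(tokens, tokens, tokens)
print(score_count)  # n * (n + 1) / 2 comparisons for n tokens
```

Changing an early token changes the outputs at later positions, while changing the last token leaves earlier outputs untouched, which is the sense in which the model "sees" what came before.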
> Of course it knows what it output a token ago...
It doesn't know anything. It has a bunch of weights that were updated by the previous stuff in the token stream. At least our brains, whatever they do, certainly don't function like that.