i dont think this is a meaningful distinction.
it knows the past tokens because theyre part of the input for predicting the next token. its part of the model architecture that it knows it.
if that isnt knowing, people dont know how to walk, only how to move limbs, and not even that, just a bunch of neurons firing
It doesn't know if it produced that token itself or if someone else did.
How close are you to saying that a repair manual "knows" how to fix your car? I think the conversation here is really around word choice and anthropomorphization.