LLMs definitely also have a finite context length, and if we consider padding, it is also constant. The k is huge compared to most Markov chains used historically, but that doesn't make it any less finite.
That's not correct. Even a toy like an exponentially weighted moving average produces unbounded context (of diminishing influence).
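A minimal sketch of that point, assuming the standard recursive EWMA definition (names and numbers here are illustrative, not from the thread): the state is a single number, yet every past input still contributes to it, with geometrically decaying weight.

```python
def ewma(xs, alpha=0.1):
    """Exponentially weighted moving average over the sequence xs."""
    s = xs[0]
    for x in xs[1:]:
        s = alpha * x + (1 - alpha) * s
    return s

# The very first input never fully drops out: after n steps its weight is
# (1 - alpha)**(n - 1), which is small but nonzero for any finite n.
xs = [1.0] + [0.0] * 99
print(ewma(xs))          # residual contribution of xs[0] after 100 steps
print((1 - 0.1) ** 99)   # matches: (1 - alpha)**(n - 1)
```

So the influence window is unbounded even though the amount of state is constant, which is the sense in which "context" here is not a fixed k.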