> Hmm, I'm not convinced that is the direction we want to go in. It's not like we have all the context of everything we ever learned present when making decisions.
I do not think it is the direction for everything.
Generally, we need consolidation of experiences and memories so we retain just the important conclusions, ideas, and concepts, plus the ability to recall the full details when they are relevant (which they usually are not).
But for some applications I am sure a billion token context would be useful.
It is likely most people need a 10-core CPU or whatever for most tasks, but for some applications you want a supercomputer with 1M cores.
I think we are heading toward a layered solution for context, because no matter how big a context window is, there needs to be a way to navigate and prioritize that context, a way to handle contradictory info, etc.
So we need a taxonomy, we need memory layers, we need summary/details. If there is one thing I have learned about how these LLMs work, it's that if you give them a few flexible tools they can work the shit out of them to achieve objectives. We just need the right tools and the right structure for context.
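As a rough sketch of what I mean by summary/details layers (all the names here are made up for illustration, not any real library): keep consolidated conclusions always visible, and give the model a tool to pull full details for a topic only when it decides they're relevant.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    topic: str    # taxonomy label
    summary: str  # consolidated conclusion, always in context
    detail: str   # full record, fetched only on demand

class MemoryStore:
    def __init__(self):
        self._items: list[Memory] = []

    def add(self, topic: str, summary: str, detail: str) -> None:
        self._items.append(Memory(topic, summary, detail))

    def overview(self) -> str:
        """What the model sees by default: one line per memory, summaries only."""
        return "\n".join(f"[{m.topic}] {m.summary}" for m in self._items)

    def expand(self, topic: str) -> str:
        """Tool call: pull the full details for one topic when relevant."""
        hits = [m.detail for m in self._items if m.topic == topic]
        return "\n".join(hits) if hits else f"(no memories under '{topic}')"

# Hypothetical contents, purely to show the shape of the two layers.
store = MemoryStore()
store.add("deploy", "Rollbacks must be one command.", "2023-11 outage postmortem: ...")
store.add("deploy", "Canary before full rollout.", "Incident review notes: ...")
store.add("billing", "Invoices are idempotent by key.", "Design doc excerpt: ...")

print(store.overview())        # cheap: conclusions only
print(store.expand("deploy"))  # expensive: full details, fetched on demand
```

The point isn't this exact structure; it's that the summary layer stays small no matter how much raw detail accumulates underneath it, and the model decides when to drill down.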