
soerxpso · last Sunday at 7:36 PM

I believe you're misunderstanding what the OP means by "long-term" memory. From what I can tell, it's not actively modifying the weights of the underlying model; it just "remembers" things from far back in its context. The point is that it can recall something it read ~200 pages earlier within a single very long context window, not that it retains anything from one session into another clean session.


Replies

AlexCoventry · last Monday at 12:09 AM

This model has fast weights, which actually are modified during inference.
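For anyone unfamiliar with the term: "fast weights" are a second set of parameters updated by an associative rule while the model runs, separate from the slow pretrained weights. Below is a minimal sketch of the general idea (a Hebbian outer-product update, as in the fast-weight-programmers literature); the toy dimensions and the update rule are illustrative assumptions, not this model's actual mechanism.

    # Minimal sketch of the fast-weights idea: Hebbian-style outer-product
    # updates applied at inference time. Illustrative toy only.
    import numpy as np

    d = 8                          # toy embedding dimension (assumption)
    W = np.zeros((d, d))           # fast weights: start empty each "session"

    rng = np.random.default_rng(0)
    key, value = rng.standard_normal(d), rng.standard_normal(d)

    # "Write": processing an input updates the weights themselves.
    W += np.outer(value, key)

    # "Read": a later query retrieves the stored association from W,
    # scaled by ||key||^2 since keys here aren't normalized.
    retrieved = W @ key
    print(np.allclose(retrieved, value * (key @ key)))   # True

This also shows why such memory doesn't survive a clean session: the fast weights W are state that must be carried forward, unlike the frozen pretrained weights.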
