> The Transformer architecture revolutionized sequence modeling with its introduction of attention, a mechanism by which models look back over earlier inputs and weigh the most relevant ones.
I've always wanted to read how something like Cursor manages memory. It seems to keep a long history of all of my prompts and to understand both the codebase and what I'm building a little better over time, making fewer errors.
That's not what they're talking about here. This is just a description of what goes on inside a transformer and its context window.
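For anyone curious what that "looking back" actually computes, here is a minimal sketch of scaled dot-product attention, the core operation the quoted line is describing. The shapes and names (`seq_len`, `d_model`) are illustrative assumptions, not taken from the article or from anything Cursor does:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each position weighs every other
    position in the context window by query-key similarity."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq_len, seq_len) relevance scores
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted sum of value vectors

# Toy example: 4 tokens in the context window, 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))   # stand-in for token embeddings
out = attention(x, x, x)      # self-attention over the window
print(out.shape)              # (4, 8)
```

The point of the scaling by `sqrt(d_k)` and the softmax is just to turn raw similarity scores into a weighting over everything already in the window; there is no persistent memory beyond that window in this mechanism itself.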