One of the big problems with attention mechanisms is that the query has to be scored against every single cached key, which gets very expensive for long contexts.
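For concreteness, here's a minimal PyTorch sketch of one decoding step (toy shapes, not any particular model): the query is dotted against all `n_keys` cached keys, so per-token cost and KV-cache memory both grow linearly with context length.

```python
import torch

d, n_keys = 64, 8192            # head dim, cached context length
q = torch.randn(d)              # query for the current token
K = torch.randn(n_keys, d)      # every cached key so far
V = torch.randn(n_keys, d)      # corresponding values

# One decoding step: the query is scored against ALL n_keys keys,
# so each new token costs O(n_keys * d) on top of an ever-growing cache.
scores = K @ q / d ** 0.5                   # (n_keys,)
out = torch.softmax(scores, dim=0) @ V      # (d,)
```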
A little side project I've been working on is training a small model that sits on top of the LLM, looks at each key, and predicts a lifespan after which that key is no longer needed; once the lifespan expires, the key gets evicted from the cache. Still working on it, but my first-pass test evicted about 90% of the keys!
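Not my actual code, but roughly how I picture the eviction side working; `LifespanHead`, `evict`, and `min_keep` are all made-up names for this sketch, and real use would score keys per layer/head rather than one flat cache:

```python
import torch
import torch.nn as nn

class LifespanHead(nn.Module):
    """Small head that predicts how many more decoding steps a key stays useful."""
    def __init__(self, d: int, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, keys: torch.Tensor) -> torch.Tensor:
        # (n_keys, d) -> (n_keys,) predicted remaining lifespan in steps
        return self.mlp(keys).squeeze(-1)

def evict(K, V, ages, head, min_keep: int = 16):
    """Drop cache entries whose predicted lifespan has expired."""
    keep = head(K) > ages                       # still predicted useful?
    if keep.sum() < min_keep:                   # safety floor: never empty the cache
        keep[ages.argsort()[:min_keep]] = True  # fall back to keeping the youngest keys
    return K[keep], V[keep], ages[keep]
```

The interesting (hard) part is the training signal, e.g. supervising the head against how long each key actually kept receiving meaningful attention weight.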
Isn't this similar to DeepSeek's lightning indexer?