coppsilgold, today at 7:51 PM

There are also interesting approaches to more directly compress a large document or an entire codebase into a smaller set of tokens without getting the LLM to wing it. For example, Cartridges: <https://hazyresearch.stanford.edu/blog/2025-06-08-cartridges>

They basically use gradient descent to optimize the KV cache itself while keeping the network weights frozen.
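
To make the "optimize the KV cache, freeze the network" idea concrete, here is a toy sketch in plain Python. It is not the Cartridges recipe (see the link for that). Everything in it is an assumption for illustration: a single attention head, random vectors standing in for the frozen model's keys, values, and queries, and a simple squared-error loss that pushes a 4-entry trainable cache to reproduce what attention over a 16-entry "document" cache would output. The names `attend`, `train_cache`, and `demo` are made up. The only things gradient descent touches are the small cache's keys and values.

```python
import math
import random

def attend(q, keys, values):
    """Single-head softmax attention for one query over a KV cache."""
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return out, weights

def train_cache(queries, targets, keys, values, lr=0.05, steps=500):
    """Gradient descent on the cache entries only; nothing else is updated."""
    d = len(queries[0])
    scale = 1.0 / math.sqrt(d)
    n = len(queries)
    losses = []
    for _ in range(steps):
        grad_k = [[0.0] * d for _ in keys]
        grad_v = [[0.0] * d for _ in values]
        loss = 0.0
        for q, target in zip(queries, targets):
            out, w = attend(q, keys, values)
            diff = [o - t for o, t in zip(out, target)]
            loss += 0.5 * sum(x * x for x in diff) / n
            err = [x / n for x in diff]  # dL/d(out)
            # dL/d(weight_i) = err . value_i
            dw = [sum(e * vj for e, vj in zip(err, v)) for v in values]
            mean_dw = sum(wi * dwi for wi, dwi in zip(w, dw))
            for i in range(len(keys)):
                ds = w[i] * (dw[i] - mean_dw)  # softmax backward pass
                for j in range(d):
                    grad_v[i][j] += w[i] * err[j]
                    grad_k[i][j] += ds * scale * q[j]
        losses.append(loss)  # loss measured before this step's update
        for i in range(len(keys)):
            for j in range(d):
                keys[i][j] -= lr * grad_k[i][j]
                values[i][j] -= lr * grad_v[i][j]
    return losses

def demo(seed=0):
    rng = random.Random(seed)
    d, n_doc, n_cache, n_queries = 4, 16, 4, 32

    def rand(rows):
        return [[rng.gauss(0, 1) for _ in range(d)] for _ in range(rows)]

    # Stand-in for the full KV cache the frozen model would build from the document.
    doc_k, doc_v = rand(n_doc), rand(n_doc)
    queries = rand(n_queries)
    # "Teacher" outputs: attention over the full document cache.
    targets = [attend(q, doc_k, doc_v)[0] for q in queries]
    # The small trainable cache, randomly initialized.
    cache_k, cache_v = rand(n_cache), rand(n_cache)
    losses = train_cache(queries, targets, cache_k, cache_v)
    return losses[0], losses[-1]

if __name__ == "__main__":
    first, last = demo()
    print(f"loss before: {first:.4f}, after: {last:.4f}")
```

In a real model the same thing would happen across every layer and head, with the loss defined on the model's outputs rather than on one attention layer, but the shape of the trick is the same: the weights never get a gradient update, only the cache does.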