Δ-Mem: Efficient Online Memory for Large Language Models

135 points • by 44za12 • today at 9:30 AM • 32 comments

Comments

Hmm, this is a case where HN’s title mangling changed the meaning of the title. The lowercase delta (δ) is used intentionally. I don’t think HN should automatically modify the casing of non-ASCII characters.

usernametaken29 • today at 11:16 AM

> δ-mem compresses past information into a fixed-size state matrix updated by delta-rule learning

This doesn’t solve the capacity problem of memory. You can cram more into one context window, but you still need to associate what you stored with input queries. That’s very hard because slight variations in input create hugely different activations. So really, it doesn’t improve caching. This paper might do a thing or two toward approximating the compression limit for context windows, but there’s a fundamental limit on how much information can go into it. What you really need is contextual search: different events and objects with the same abstractions and semantics should lead to the same response, so you can cache effectively. On this front the paper does little to improve “memory” in a meaningful way.
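For anyone unfamiliar with the delta rule in the quoted sentence, here is a minimal sketch in plain Python. It assumes the textbook form S ← S + β(v − Sk)kᵀ; the paper's actual update may well differ, and `delta_write`, `read` and `beta` are names made up for this illustration. It also shows the capacity point: a fixed-size state recalls orthogonal keys exactly, but once keys overlap, new writes disturb old entries.

```python
# Minimal sketch of a delta-rule associative memory (illustrative, not the paper's code).
# S is a fixed-size d_v x d_k matrix, so every write costs the same no matter
# how many items were stored before.

def read(S, q):
    """Return S @ q: the value the memory currently associates with query q."""
    return [sum(s_ij * q_j for s_ij, q_j in zip(row, q)) for row in S]

def delta_write(S, k, v, beta=1.0):
    """Delta rule: S <- S + beta * (v - S k) k^T (beta is a hypothetical write strength)."""
    err = [v_i - p_i for v_i, p_i in zip(v, read(S, k))]
    return [[s_ij + beta * e_i * k_j for s_ij, k_j in zip(row, k)]
            for row, e_i in zip(S, err)]

S = [[0.0, 0.0], [0.0, 0.0]]            # 2x2 state: room for two orthogonal keys
S = delta_write(S, [1.0, 0.0], [2.0, -1.0])
S = delta_write(S, [0.0, 1.0], [0.5, 0.5])
exact_a = read(S, [1.0, 0.0])            # both stored values come back exactly
exact_b = read(S, [0.0, 1.0])

# A third key has to overlap the first two, so writing it disturbs earlier entries.
k3 = [0.6, 0.8]                          # unit length, not orthogonal to either key
S = delta_write(S, k3, [1.0, 1.0])
after_a = read(S, [1.0, 0.0])            # no longer exactly [2.0, -1.0]
```

With unit-length keys and `beta=1.0`, the most recent write is always recalled correctly; what degrades is everything older that shares a direction with it.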

3form • today at 10:40 AM

Interesting points:

- fixed size of the memory seems like a good idea to overcome the current limitations

- skimming through the thing, I can't find any mention of the cost?

- I would need more time to read it in depth to see if this is legitimate and not just a fancy form of overfitting or training on test data

raverbashing • today at 11:31 AM

Interesting that the headline is showing Δ-Mem while the paper uses δ-mem

Is there a lowercase-to-uppercase conversion going on here?
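A quick way to see how that could happen, using Python's built-in Unicode case mapping. This only illustrates generic title-casing; it is an assumption that HN does something similar, and it is not HN's actual code.

```python
# Illustration only: Python's Unicode-aware case mapping, not HN's title logic.
paper_title = "δ-mem"
upper_delta = "δ".upper()            # Greek small delta U+03B4 -> capital delta U+0394
title_cased = paper_title.title()    # capitalizes the first letter of each word
print(upper_delta, title_cased)
```

Any casing routine that treats δ as an ordinary letter will turn the paper's name into the form shown in the headline.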

ktallett • today at 10:23 AM

The obvious energy-saving step would be to utilise previous searches by others. Many of the tasks people do are rather similar, so it is such a waste of energy to start again each time.

(Obviously ignoring the huge energy saver, which is to observe if you even need to bother doing the task at all.)

DeathArrow • today at 10:23 AM

I see lots of techniques proposed to give LLMs the capacity to recall things. I even saw a lot of memory plugins for AI coding agents and tried some myself.

What I want to see is something that was tested and proved in practice to be genuinely useful, especially for coding agents.

zhenglei11 • today at 10:40 AM

[flagged]

cubefox • today at 12:30 PM

Papers being voted high on Hacker News are usually uncorrelated with their actual importance. It's basically a lottery. There are regularly more interesting papers going semi viral on Twitter.

belabartok39 • today at 12:56 PM

Did AI generate this paper too?
