I'll put the Tandem five-minute rule paper here; it seems very relevant.
https://dsf.berkeley.edu/cs286/papers/fiveminute-tr1986.pdf
and a revisit of the rule 20 years later (it still held):
https://cs-people.bu.edu/mathan/reading-groups/papers-classi...
What they seem to want is fast-read, slow-write memory. "Primary applications include model weights in ML inference, code pages, hot instruction paths, and relatively static data pages". Is there device physics that would allow cheaper, smaller fast-read, slow-write memory cells for that?
For "hot instruction paths", caching is already the answer. Not sure about locality of reference for model weights. Do LLMs blow the cache?
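Back-of-envelope arithmetic (assumed figures: a 7B-parameter model in fp16 and roughly 32 MB of L3) suggests yes, at least for single-stream decode, since every weight is read once per generated token. A tiny C sketch of that arithmetic:

    #include <stdio.h>

    int main(void) {
        /* Assumed figures: 7B-parameter model, fp16 weights, 32 MB L3. */
        double params = 7e9;
        double bytes_per_param = 2.0;  /* fp16 */
        double l3_bytes = 32e6;

        double weight_bytes = params * bytes_per_param;  /* ~14 GB */
        printf("weights: %.1f GB vs L3: %.0f MB (%.0fx larger)\n",
               weight_bytes / 1e9, l3_bytes / 1e6, weight_bytes / l3_bytes);
        /* With batch size 1, each weight is used once per token and then
           evicted long before it's needed again, so the weight stream
           effectively flushes the cache on every token. */
        return 0;
    }

Batching amortizes the streaming across requests, but it still doesn't make the weights cache-resident.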
I'm not seeing the case for adding this to general-purpose CPUs/software. Only a small portion of software is going to be properly annotated to take advantage of this, so it'd be a pointless cost for the rest of users. Nominally short-term access can easily become long-term in the tail: the process gets preempted by something higher priority, or spends a lot of time on an I/O operation. It's also not clear why, if you had an efficient solution for the short-term case, you wouldn't just add a refresh cycle and use it in place of normal SRAM as a generic cache. These make a lot more sense in a dedicated hardware context -- like neural nets -- which I think is the authors' main target here.
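To make the annotation burden concrete, here's a hypothetical hint-based allocator interface (mrm_alloc and retention_hint are made-up names, not from the paper). Nothing stops a "short" allocation from quietly outliving its promise when the thread blocks:

    #include <stdlib.h>

    /* Hypothetical interface: the programmer promises how long the data
       stays hot so the runtime can pick a short- or long-retention pool. */
    typedef enum { RETENTION_SHORT, RETENTION_LONG } retention_hint;

    void *mrm_alloc(size_t n, retention_hint hint) {
        (void)hint;       /* a real system would choose a pool here */
        return malloc(n); /* sketch: fall back to the ordinary heap */
    }

    /* The failure mode described above: a RETENTION_SHORT buffer that sits
       across a preemption or a slow read() has become long-lived, and
       nothing in the interface catches it. */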
Sounds a bit like Intel's Optane, which seemed great in principle, but I never had a use for it.
https://www.intel.com/content/www/us/en/products/details/mem...
Are there new physics on the horizon that could pave the way for new memory technologies?
Wouldn't a generational garbage collector automatically separate objects into appropriate lifetime categories?
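It does something close to that for heap objects. A minimal, conceptual C sketch of the segregation a generational collector performs (real collectors add tracing, write barriers, and promotion policies):

    #include <stddef.h>
    #include <string.h>

    /* A generational collector keeps short-lived objects in a small
       "nursery" and promotes survivors to an "old" space -- exactly a
       lifetime-based segregation of memory. */

    #define NURSERY_SIZE (64 * 1024)
    #define OLD_SIZE     (1024 * 1024)

    static unsigned char nursery[NURSERY_SIZE];
    static unsigned char old_space[OLD_SIZE];
    static size_t nursery_top, old_top;

    /* Bump-allocate new objects in the nursery (fast, short-retention pool). */
    void *gc_alloc(size_t n) {
        if (nursery_top + n > NURSERY_SIZE)
            return NULL;  /* a real GC would run a minor collection here */
        void *p = &nursery[nursery_top];
        nursery_top += n;
        return p;
    }

    /* Survivors of a minor collection get copied into the old generation,
       i.e. the denser, long-retention pool. */
    void *gc_promote(const void *obj, size_t n) {
        if (old_top + n > OLD_SIZE)
            return NULL;
        void *p = memcpy(&old_space[old_top], obj, n);
        old_top += n;
        return p;
    }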
I don't know what the point of these fantasy-computer papers is if there's no hardware implementation, or even just a design, of their concepts. Even managed retention memory is not a thing yet, so what's the point of all of this?
In the microcontroller world, there's already asymmetric RAM like this, although it's all based on the same (SRAM) technology and the distinction is topological. You have TCM directly coupled to the core; then you generally have a few SRAM blocks attached to an AXI crossbar (so that if software running on different µC cores doesn't simultaneously access the same block, you get non-interference on timing, while simultaneous access is still allowed at the cost of known timing); and then a few more SRAM blocks a couple of AXI bridges away from the point of view of a core (for example, closer to a DMA engine, a low-power core, or another peripheral that masters the bus). You can choose to ignore this, but for maximum performance and (more importantly) maximum timing determinism, understanding what is in which block is key. And that's without getting into EMIFs and off-chip SRAM and DRAM, or XIP out of various NVM technologies...
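As a concrete illustration (the section names are hypothetical and depend entirely on your linker script), placement on a Cortex-M-class part is typically expressed with GCC section attributes, roughly like this:

    #include <stdint.h>

    /* Hot ISR state in DTCM: single-cycle access, never contends on the bus. */
    __attribute__((section(".dtcm_data")))
    static volatile uint32_t isr_scratch[64];

    /* DMA ring buffer in an SRAM bank near the DMA engine, so the core and
       the DMA master don't fight over the same AXI slave port. */
    __attribute__((section(".sram2_dma")))
    static uint8_t rx_ring[4096];

    /* Time-critical code in ITCM (or marked for the startup code to copy
       there) so instruction fetches never stall on the crossbar. */
    __attribute__((section(".itcm_text"), noinline))
    uint32_t filter_step(uint32_t x) {
        return (x >> 1) + (x >> 3);  /* placeholder for real DSP work */
    }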