Hacker News

takwatanabe | yesterday at 11:55 PM | 3 replies

The post-it note analogy is good, but as a psychiatrist, I'd frame it differently: LLMs are essentially patients with anterograde amnesia.

They can reason brilliantly within a single conversation — just like an amnesic patient can hold an intelligent discussion — but the moment the session ends, everything is gone. No learning happened. No memory formed.

What's worse, even within a session, they degrade. Research shows that effective context utilization drops to <1% of the nominal window on some tasks (Paulsen 2025). Claude 3.5 Sonnet's 200K context has an effective window of ~4K on certain benchmarks. Du et al. (EMNLP 2025) found that context length alone causes 13-85% performance degradation — even when all irrelevant tokens are removed. Length itself is the poison.

This pattern is structurally identical to what I see in clinical practice every day. Anxiety fills working memory with background worry, hallucinations inject noise tokens, depressive rumination creates circular context that blocks updating. In every case, the treatment is the same: clear the context. Medication, sleep, or — for an LLM — a fresh session.

The industry keeps betting on bigger context windows, but that's expanding warehouse floor space while the desk stays the same size. The human brain solved this hundreds of millions of years ago: store everything in long-term memory, recall selectively when needed, consolidate during sleep, and actively forget what's no longer useful.
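As a rough illustration of the store / recall-selectively / forget loop described above, here is a toy sketch. Everything in it is an assumption made for illustration: the `MemoryStore` class, the word-overlap scoring, and the logical clock are hypothetical, and consolidation during sleep is not modeled at all. A real system would more likely use embeddings and a proper index.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    text: str
    last_used: int  # logical clock tick when last stored or recalled


@dataclass
class MemoryStore:
    """Toy long-term store: everything goes in, only a little comes out at once."""
    items: list = field(default_factory=list)
    clock: int = 0

    def store(self, text: str) -> None:
        self.clock += 1
        self.items.append(Memory(text, self.clock))

    def recall(self, query: str, k: int = 2) -> list:
        """Return up to k memories sharing the most words with the query."""
        self.clock += 1
        words = set(query.lower().split())
        scored = [(len(words & set(m.text.lower().split())), m) for m in self.items]
        ranked = sorted(scored, key=lambda pair: -pair[0])  # stable, highest overlap first
        hits = [m for score, m in ranked if score > 0][:k]
        for m in hits:
            m.last_used = self.clock  # recalled memories stay fresh
        return [m.text for m in hits]

    def forget(self, max_age: int) -> None:
        """Drop memories not stored or recalled within the last max_age ticks."""
        self.items = [m for m in self.items if self.clock - m.last_used <= max_age]
```

The point of the sketch is only that the "desk" (what `recall` returns) stays small no matter how large the "warehouse" (`items`) grows, and that unused entries age out.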

We can build the smartest single model in the world — the greatest genius humanity has ever seen — but a genius with no memory and no sleep is still just an amnesic savant. The ceiling isn't intelligence. It's architecture.


Replies

0xbadcafebee | today at 2:34 AM

Yep. It's the guy from the movie "Memento" doing your physics homework on a couple pages of legal paper. When he runs out of paper, he has to write a post-it note summarizing it all, then burn the papers, and his memory resets. You can only do so much with that.

If we can crack long memory we're most of the way there. But you need RL in addition to long memory or the model doesn't improve. Part of the genius of humans is their adaptability. Show them how to make coffee with one coffee machine, and they adapt to pretty much every other coffee machine; that's not just memory, that's RL. (Or a simpler example: crows are more capable of learning and acting with memory than an LLM is.)

Currently the only way around both of these is brute force (take in RL input from users/experiments, re-train the models constantly), and that's both very slow and error-prone (the flaws in models' thinking come from a lack of high-quality RL inputs). So without two major breakthroughs we're stuck tweaking what we got.

ACCount37 | today at 1:21 AM

A lot of that seems to be the usual "you're training them wrong".

Sonnet 3.5 is old hat, and today's Sonnet 4.6 ships with an extra-long 1M context window, and performs better on long-context tasks while at it.

There are also attempts to address long context attention performance on the architectural side - streaming, learned KV dropout, differential attention. All of which can allow LLMs to sustain longer sessions and leverage longer contexts better.

If we're comparing to wet meat, then the closest thing humans have to context is working memory. Which humans also get a limited amount of - but can use to do complex work by loading things in and out of it. Which LLMs can also be trained to do. Today's tools like file search and context compression are crude versions of that.
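A minimal sketch of what "context compression" can look like in its crudest form: once the running context exceeds a budget, fold the older messages into one summary and keep only the most recent turns verbatim. The function names, the word-count tokenizer, and the `stub_summarize` placeholder are all assumptions for illustration; a real tool would call a model to summarize and use the model's own tokenizer.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def stub_summarize(messages: list) -> str:
    # Placeholder for a model call: keep the first three words of each message.
    return "SUMMARY: " + " | ".join(" ".join(m.split()[:3]) for m in messages)


def compact(messages: list, budget: int, keep_recent: int = 2,
            summarize=stub_summarize) -> list:
    """Fold older messages into one summary once the context exceeds budget."""
    total = sum(count_tokens(m) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return messages  # still fits, or nothing old enough to fold
    cut = len(messages) - keep_recent
    old, recent = messages[:cut], messages[cut:]
    return [summarize(old)] + recent
```

Calling `compact(history, budget=...)` before each model call keeps the working set bounded, at the cost of whatever the summary drops.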

torginus | today at 12:12 AM

I want to believe I'm reading an insightful comment from an actual human deeply familiar with both human cognition and how LLMs work, but this post is chock full of LLMisms.
