Why don't people just use the combination of
- copilot-instructions.md / CLAUDE.md
- the Project's Readme.md
- Chat history feature (e.g in VS Code)
it works perfectly well for me to continue where I left off on any project I'm working on.

I recommend installing Google's Antigravity and digging into its temp files in the user folder. You'll find some interesting ideas on how to organize memory there (the memory structure consists of: Brain / Conversation / Implicits / Knowledge items / Artifacts / Annotations / etc.).
I'd also add that memory is best organized when it's "directed" (purpose-driven). You've already started asking questions where the answers become the memories (at least, you mention this in your description). So, it's really helpful to also define the structure of the answer, or a sequence of questions that lead to a specific conclusion. That way, the memories will be useful instead of turning into chaos.
AI written articles that the author didn't even bother to touch up should result in a permanent ban from posting.
I've been building persistent memory for Claude Code too, narrower focus though: the AI's model of the user specifically. Different goal but I kept hitting what I think is a universal problem with long-lived memory. Not all stored information is equally reliable and nothing degrades gracefully.
An observation from 30 sessions ago and a guess from one offhand remark just sit at the same level. So I started tagging beliefs with confidence scores and timestamps, and decaying ones that haven't been reinforced. The most useful piece ended up being a contradictions log where conflicting observations both stay on the record. Default status: unresolved.
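To make that concrete, here's a toy sketch of the scheme (all names and numbers are made up, not the actual implementation):

```python
import time
from dataclasses import dataclass, field

# Hypothetical choice: unreinforced beliefs lose half their confidence in ~30 days.
DECAY_HALF_LIFE = 30 * 86400


@dataclass
class Belief:
    text: str
    confidence: float       # 0.0-1.0, set when the belief is stored
    last_reinforced: float  # unix timestamp of the last supporting observation

    def current_confidence(self, now=None):
        """Confidence after exponential decay since last reinforcement."""
        now = now or time.time()
        age = now - self.last_reinforced
        return self.confidence * 0.5 ** (age / DECAY_HALF_LIFE)

    def reinforce(self, boost=0.1):
        """A repeated observation bumps confidence and resets the decay clock."""
        self.confidence = min(1.0, self.confidence + boost)
        self.last_reinforced = time.time()


@dataclass
class ContradictionLog:
    entries: list = field(default_factory=list)

    def record(self, old: Belief, new: Belief):
        # Both observations stay on the record; neither overwrites the other.
        self.entries.append({"a": old, "b": new, "status": "unresolved"})
```

The point is just that a guess from one offhand remark starts with a low score and quietly fades out, while a belief that keeps getting reinforced stays near the top.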
Tiered loading is smart for retrieval. Curious if you've thought about the confidence problem on top of it, like when something in warm memory goes stale or conflicts with something newer.
How is this different and/or more interesting than Superpowers' episodic-memory skill¹ or Anthropic's Auto Dream²?
¹ https://github.com/obra/episodic-memory ² https://claudefa.st/blog/guide/mechanics/auto-dream
As we're beginning to discover, there isn't a one-size-fits-all solution to the problem. The memory architecture you'd use for a coding assistant is quite different from the one you might use for a research assistant, which needs to track evolving context across long investigations rather than discrete task completions.
And yeah, it's not like a human "brain" or anything like that, and drawing parallels between the two is simply the wrong way to look at the problem.
How’s it different / better / worse than CLAUDE.md or auto memory?
If open models on local hardware were more cost-effective and competitive, it would be obvious what a superficial approach this is. (I mean, it's still obvious, but what are ya gonna do?)
We would be doing the same general loop, but fine tuning the model overnight.
I still think the current LLM architecture(s) is a very useful local maximum, but ultimately a dead end for AI.
What I've been trying is a sort of nested memory system. I think it's important to keep a running log of short-term memory that can be recalled by a specific memory, in the context of what else was going on or what we were talking about at the time. Auto compaction at night makes a lot of sense, but I'm thinking of modeling memory more like human memory. Each of us does it slightly differently, but I generally think in terms of immediate short-term memory, long-term memory, and then specific memories that are archived.
For example, when I'm trying to remember something from a long time ago, I often start to remember other bits of context, such as where I was, who I was talking to, and what else was in my context at the time. As I keep remembering other details, I remember more about whatever it was I was trying to think about. So, while the auto-sleep compaction is great, I don't think we should just work from the pruned versions.
(I can't tell if that's how this project works or not)
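To make the associative-recall part concrete, here's a toy sketch (hypothetical names, not how this project works): memories live in tiers and are linked by shared context tags, so recalling one item surfaces what else was going on at the time.

```python
from collections import defaultdict


class NestedMemory:
    """Toy sketch: tiered memories linked by shared context tags."""

    TIERS = ("short_term", "long_term", "archive")

    def __init__(self):
        self.items = []                  # list of (tier, text, tags)
        self.by_tag = defaultdict(list)  # tag -> indices into self.items

    def store(self, text, tags, tier="short_term"):
        idx = len(self.items)
        self.items.append((tier, text, set(tags)))
        for t in tags:
            self.by_tag[t].append(idx)

    def recall(self, query_tags):
        """Return matching memories plus their co-occurring context,
        mimicking "remembering where I was and who I was talking to"."""
        seen, results = set(), []
        for t in query_tags:
            for idx in self.by_tag[t]:
                if idx in seen:
                    continue
                seen.add(idx)
                tier, text, tags = self.items[idx]
                results.append((tier, text))
                # Follow the item's other tags one hop to pull in context.
                for t2 in tags - set(query_tags):
                    for idx2 in self.by_tag[t2]:
                        if idx2 not in seen:
                            seen.add(idx2)
                            results.append(self.items[idx2][:2])
        return results
```

Recalling one memory drags in whatever shared a tag with it, which is the "other bits of context" effect, while unrelated archive items stay out of the way.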
I do something similar. I have an onboarding/shutdown flow in onboarding.md. On cold start, it reads the project essays: the why, ethos, and impact of the project/company. Then it reads journal.md, musings.md, and the product specification, protocol specs, implementation plans, roadmaps, etc.
The journal is a scratchpad for stuff that it doesn't put in memory but doesn't want to forget(?). Musings.md is strictly non-technical: its impressions and musings about the work, the user, whatever. I framed it as a form of existential continuity.
The wrapup is to comb all the docs and make sure they're still consistent with the code, then note anything it felt was left hanging, then update all its files with the day's impressions and info, then push and submit a PR.
I go out of my way to treat it as a collaborator rather than a tool. I get much better work out of it with this workflow, and it claims to be deeply invested in the work. It actually shows, but it’s also a token fire lol.
Context management is one area where I’ve found Codex is a lot better than Claude. Sessions last forever without degrading. Codex doesn’t forget things that Claude drops from context.
My first thought was: yet another memory architecture for Claude. But the concept is quite cool. I'm not sure I'll use it for Claude, because I told my OpenClaw instance to copy the idea and set up a cron job.
This is good. Has anyone tried building any large scale applications entirely using Claude and maintaining it for a while with users paying for it? I’m looking for real life examples for inspiration.
I like the idea of various extensions of LLM context using transparent plaintext, automatic consolidation and summarization... but I just can't read this LLM-generated text documenting it. The style is so painful. If someone ends up finding this tooling useful I hope they write it up and I hear about it again!
Didn't Epstein fund the original COG AI project out of Hong Kong?
I use Claude Code a lot, especially when building infrastructure. The most important thing in my work isn't so much the memory architecture as a well-structured CLAUDE.md (architectural decisions, file paths, rules like "do X, don't do Y"). And generally, the model follows these rules quite well. As for memory, storing feedback (corrections with explanations) is much more effective than storing bare facts. It's better to write something like "Don't simulate the database in integration tests, because the tests passed but the migration failed" than "the database is PostgreSQL 16."
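To illustrate, a made-up fragment of what I mean (not a real file):

```markdown
## Lessons learned (corrections, with the why)
- Don't simulate the database in integration tests: the mocked tests passed, but the migration failed.

## Bare facts (less useful on their own)
- The database is PostgreSQL 16.
```

The first kind of entry tells the model what to avoid and why; the second only tells it what exists.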