> notes/ — Narrative. What happened each session — decisions, actions, open items. Append-only. Never modified after the day ends
I already have to fight the agent constantly to prevent it from adding backwards compatibility, workarounds, wrappers, etc. for things that I changed or removed. If there's even one forgotten comment that references the old way, it'll write a whole converter system if I walk out of the room for 5 minutes, just in case we ever need it, even though my agents file specifically says not to (YAGNI, all backwards compatibility must be cleared by me, no wrappers without my explicit approval, etc.). Having a log of the things we tried last month but which failed and got discarded sounds like a bad idea, unless it's specifically curated to say "list of things that failed: ...", which, by definition, an append-only log can't do.
I have hit the situation where it discovered removed systems through git history, even. At least that's rare though.
the one file that does the same (maybe code focused, easily adapted)
---
The only documentation to write is project.md and TODO.md; do not write documentation anywhere else.
TODO.md: document gaps, tasks, and progress, grouped by feature.
project.md: document architecture, the responsibility map, features, and the tribal knowledge needed to find things.
Do not document code, methods, or classes.
STANDARD OPERATING PROCEDURES:
Beginning of task:
- read: goals.md tech.md project.md
- update TODO.md: add step-by-step [ ] tasks under the # feature you will implement
During execution of task:
- perform the task step by step; delegate to subtasks or subagents where possible
- log with [x] the work performed in TODO.md as you go
End of task:
- remove completed features from the TODO.md
- maintain project.md
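For what it's worth, here is a minimal sketch of the TODO.md shape this SOP produces (the feature names and tasks are made up):

```
# user-auth
- [x] add login endpoint
- [x] hash passwords on write
- [ ] expire sessions after 24h

# csv-export
- [ ] stream rows instead of buffering the whole file
```

Once every box under a `# feature` heading is ticked, the whole section gets removed at end of task, and anything durable moves into project.md.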
Text files + Git > Vector DBs. Nice work. I'm curious about how this scales. As the notes/ directory grows over weeks or months, reading past daily logs will eat up the context window.
I've been working with an agent as a secretary for 3-4 weeks now. CLAUDE.md, daily journal, state file, pipeline tracking.
bigbezet is right, agents have no clue what's worth remembering. What works for me is splitting it: the agent writes what happened, I decide what actually matters. Two places to manage: the journal and STATE.md, which I ask it to maintain based on my expectations. The agent can read the journal if it needs to, but the main place to check the status is STATE.md.
One thing I haven't seen anyone mention, though. After a few weeks of reading your rants about some coworker, the agent just takes your side on everything. Had to literally add "consider the other person's perspective" to my rules file. It just has too many one-sided notes in the journal. Otherwise you end up with a yes-man that has perfect memory.
The trauma replay thing gaigalas mentioned is real too. I found it hard to keep the agent from being biased. To be frank, I'm noticing something like this even now:
- I complain, the agent defends me.
- I paste into the chat a response from another LLM that wasn't biased by my journal. It flips sides and now says the research makes a lot of sense.
- I say: "How biased are you right now?" and it responds with something about being biased and "... to be frank, the truth is: ...".
Even when asked not to be biased, it starts playing biased because it thinks I expect that. Sneaky bastard.
> AI agents already read AGENTS.md (or CLAUDE.md, .cursorrules, etc.) as project instructions. This kernel uses that mechanism to teach the agent how to remember.
Dude, this is just prompts. It is as useful as asking Claude Code to write these files itself.
It's curious when agents remember traumatic events and replay them instead of avoiding them.
I was stuck on a task for a couple of days. Deleted the memory about some debugging sessions, thing just unlocked itself again. The harness was basically replaying the trauma over and over again.
I honestly think it's better to not have stateful stuff when working with agents.
Yeah, all the Claude-based agents use a similar structure. I experimented a little with a SQLite DB with embeddings so it could do vector search, but I didn't manage to get it to do better. Best is still to just stuff things in context and let it full-text search the history.
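For the plain-text route, "full-text search" really is just grep over the notes directory, something like this (the directory name follows the article's layout; the search term is made up):

```shell
# Case-insensitive full-text search over plain-text notes:
# -r recurses into notes/, -i ignores case, -l prints only matching filenames
grep -ril "retry logic" notes/
```

The agent then opens only the files that matched, which keeps the context window small even as notes/ grows.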
A lot of devs, including me, have tried something similar already. I don't really find this approach reliable.
Firstly, tools like Claude Code already have things like auto memory.
Secondly, I think we all learned by now that agents will not always reliably follow instructions in the AGENTS.md file, especially as the size of the context increases. If we want to guarantee that something happens, we should use hooks.
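For example, Claude Code supports hooks in .claude/settings.json that fire deterministically regardless of how full the context is; roughly like this (the event name is from the hooks docs, the script path is hypothetical):

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "./scripts/append-session-log.sh" }
        ]
      }
    ]
  }
}
```

A Stop hook like this can append the session summary to a log file every time, instead of hoping the model remembers an AGENTS.md instruction.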
There are already solutions for tracking what the agent did, and even summarising it without affecting the agent's context window. Tools like Claude Code log activity in a form that can be analyzed, so you can use other tools to process that.
When I tried something similar in the past, the agent would not really understand what is important to "memorise" in a KNOWLEDGE.md file, and would create a lot of bloat which I would then need to clean up anyway.
There are existing tools to tell the agent what has happened recently: git. By looking at the commit messages and list of changed files, the agent usually gets most of the information it needs. If there are any very important decisions or learnings which are necessary for the agent to understand more, they should be written down >manually< by a developer, as I don't trust the agent to decide that.
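Concretely, that means the agent can answer "what happened recently" with a couple of ordinary git commands instead of a memory file (the symbol name in the last one is just an example):

```shell
# Recent decisions, one line per commit
git log --oneline -n 20
# Which files each recent change touched
git log --stat -n 5
# Pickaxe: find the commits where a symbol was added or removed
git log -S ConverterSystem --oneline
```

The pickaxe search in particular answers the "did we ever have X and why did it go away" question directly from history, no curated log needed.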
Also, there is an ongoing discussion about whether AGENTS.md files are even needed, or whether they should be kept to an absolute minimum. Despite what we all initially thought, those files can actually negatively affect the output, based on recent research.