Increasingly, I'd like the code to live alongside a journal and research log. My workflow right now is spending most of my time in Obsidian writing design docs for features, and then manually managing claude sessions that I paste them back and forth into. I have a page in obsidian for each ongoing session, and I record my prompts, forked paths, thoughts on future directions, etc. It seems natural that at some point this (code, journal, LLM context) will all be unified.
I think this is a lot of kicking the can down the road on not understanding what code the AI is writing. Once you give up understanding the code that is written, there is no going back. You can add all the helper commit messages, architecture designs, and plans, but then you introduce the problem of having to read all of those once you run into an issue. We've left readability by the wayside at the altar of "writability".
The paradigm shift, which is a shift back, is to embrace the fact that you have to slow down, and understand all the code the ai is writing.
I think that's covered by the YAGNI rule. It has very little value, and that value drops off rapidly as you commit more code. Maybe for some types of software you might want to store some of it for compliance/auditing reasons. But beyond that, I don't see what you would use it for.
I think the decisions it made along the way are worth tracking. And it’s got some useful side effects with regard to actually going through the programming and architecture process. I made a tool that really helps with this and finds a pretty portable middle ground that can be used by one person or a team; it’s flexible. https://deciduous.dev/
Yes, it should remain part of the commit, and the work plan too, including judgements/reviews done with other agents. The chat log encodes user intent in raw form, which justifies tasks, which in turn justify the code and its tests. Bottom-up, we say the tests satisfy the code, which satisfies the plan and finally the user intent. You can play the "satisfied/justified" game across the whole stack.
I only log my own user messages, not AI responses, in a chat_log.md file, which is created by a user-message hook in the repo.
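For what it's worth, a hook like that can be tiny. A minimal sketch, assuming the agent invokes the hook with the prompt as JSON on stdin (the `prompt` field name and the `chat_log.md` path are assumptions, not any particular tool's documented interface):

```shell
#!/bin/sh
# Append only the user's message (never the AI response) to chat_log.md.
# Assumes the hook runner pipes JSON like {"prompt": "..."} to stdin;
# jq extracts the text.
log_prompt() {
  prompt=$(jq -r '.prompt')
  printf '%s\n' "- $prompt" >> chat_log.md
}

# Example of how a hook runner might call it:
echo '{"prompt": "refactor the pagination helper"}' | log_prompt
```

The point is that the log captures intent as you typed it, with none of the model's output mixed in.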
It is a useful piece of information, but the session is not “long lived” in terms of git commit history lifetime.
YES! The session becomes the source code.
Back in the dark ages, you'd "cc -S hello.c" to check the assembler source. With time we stopped doing that and hello.c became the originating artefact. On the same basis the session becomes the originating artefact.
The session capture problem is harder than it looks because you need to capture intent, not steps.
A coding session has a lot of 'left turn, dead end, backtrack' noise that buries the decision that actually mattered. Committing the full session is like committing compiler output — technically complete, practically unreadable.
We've been experimenting with structured post-task reflections instead: after completing significant work, capture what you tried, what failed, what you'd do differently, and the actual decision reasoning. A few hundred tokens instead of tens of thousands. Commits with a reflection pointer rather than an embedded session.
The result is more useful than raw logs. Future engineers (or future AI sessions) can understand intent without replaying the whole conversation. It's closer to how good commit messages work — not 'here's what changed' but 'here's why'.
Dang's point about there being no single session is also real. Our biggest tasks span multiple sessions and multiple contributors. 'Capture the session' doesn't compose. 'Capture the decision' does.
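A sketch of what one of those post-task reflections might look like; the headings and the task are made up for illustration, not a fixed schema:

```
## Reflection: add cursor-based pagination

Tried: offset pagination first; it broke under concurrent inserts.
Failed: caching page counts; invalidation cost more than it saved.
Would do differently: spike the data-layer change before touching the API.
Decision: cursor on (created_at, id). Stable under writes, no count queries.
```

The commit then carries only a pointer to it, e.g. a `Reflection: docs/reflections/pagination.md` trailer (path hypothetical), rather than an embedded session.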
For my own projects in private repos I would benefit from exporting the session. For example, if I need to return to the task, it could be great to give it as context.
For my work as one of the developers on a team, no. The way I prompt is my asset: it's my advantage over teammates who constantly complain that AI can't produce correct solutions, and it secures my career.
I've gotten into the habit of having the LLM produce a description of its process and summarize the change. Then I add that, along with the model I used, after my own commit message. It lets me know where I used AI and what I thought it did at the time.
The entire prompt and process would be fine if my git history were the subject of research, but really it is a tool for me, or anyone else who wants to know what happened at a given time.
Have AI explain the reasoning behind the PR. I don't think people really care about your step-by-step process, but reviewers might care about your approach, design choices, caveats, and trade-offs.
That context could clarify the problem, why the solution was chosen, key assumptions, potential risks, and future work.
No. Even further than that: by maintaining AGENTS.md and the like in your company repo, you basically train your own replacement. That replacement will not be as capable as you in the long run, but few businesses will care. In any case, having some representation of an employee's thinking definitely lowers the cost of firing that employee.
That is a cynical take, and not very different from advice to never write any documentation or never help your teammates. But that resemblance is superficial: in any organization you shouldn't help people who steal your time for their own benefit (Sean Goedecke calls them predators https://www.seangoedecke.com/predators/).
On the other hand, it may be beneficial to privately save CLAUDE.md and other parts of persistent context. You may gitignore them (but that will be conspicuous unless you also gitignore .gitignore) or just load them from ~/.claude
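On the conspicuousness point, git has a less visible option than .gitignore: the per-repo exclude file, which is never committed. A small demo (the `demo` directory and `CLAUDE.md` contents are just for illustration):

```shell
# .git/info/exclude works like .gitignore but lives outside the
# tracked tree, so the exclusion itself never appears in the repo.
git init -q demo
echo 'CLAUDE.md' >> demo/.git/info/exclude
echo 'persistent context' > demo/CLAUDE.md

# CLAUDE.md no longer shows up as untracked:
git -C demo status --porcelain
```

The status output is empty: the file sits in the working tree, ignored, with no trace in any tracked file.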
I expect an enterprise version of Claude Code that will save any human input to the org servers for later use.
No, because if AI is set to replace a human – their prompting skill and approach are the only things differentiating them from the rest of the grey mass.
If the full session capture is not encoded such that it provides insight into the architecture and the mistakes made, what was the point? There needs to be 1. complete capture (all tool calls, etc.) and 2. a curated, readable presentation of it (collapsible, chronological, easy to navigate). A .txt dump of agent CoT is not particularly useful to anyone aside from another agent.
It's already bad enough that people are saying there's too much code to read and review. You want to add sessions to that? Running a session again might not yield the same output: these models are non-deterministic, and the models themselves are often changed and upgraded.
I put a link to the LLM session at the end of the commit, and prefix with POH: if I wrote it by hand.
POH = Plain Old Human
Easy to achieve.
Why NOT include a link back? Why deprive yourself of information?
If you need LLM sessions included to understand or explain commits, you're doing something wrong.
Saving sessions is even more pointless without the full context the LLM uses, which is hidden from the user. And that context is too noisy to include.
I haven't adopted this yet, but have a feeling that something like this is the right level of recording the llm contribution / session https://blog.bryanl.dev/posts/change-intent-records
The last 5 sessions. Beyond that I archive them outside the repo. But I do save them for review and summaries.
In our (small) team, we’ve taken to documenting/disclosing what part(s) of the process an LLM tool played in the proposed changes. We’ve all agreed that we like this better, both as submitters and reviewers. And though we’ve discussed why, none of us has articulated exactly WHY we like this model better.
If you can, run several agents. They document their process: trade-offs considered, reasoning, etc. It’s not a full log of the session, but a reasonable history of how the code came to be. Commit it with the code. Namespace it however you want.
I drop a lot of F-bombs and other unpleasantries when I talk to the robots, so I'd rather not.
Everything in git can and must be merge-able when merging branches. After all, git is a collaboration tool, not an undo-redo stack.
If a person writes code, should all the process be part of the commit?
If a human writes code, should the jira ticket be part of the commit? I am actually thinking about potential merits.
One of the use cases I see for this tool is helping companies understand the output coming from the LLM black box, and the process the employee took to complete a certain task.
If AI could reliably write good code, then you shouldn't even need to commit the code, as the general rule is that you shouldn't commit generated code. Commit the session when you don't need to commit the code.
instead of committing code, we should just save videos of all of the zoom meetings about the code
What's the value, given that answers are not deterministic?
Pre-AI, if I had to include Google search queries in a commit, I’d be so embarrassed I’d probably never commit code, like, ever.
Isn’t that what entire.io, founded by the former GitHub CEO, is doing?
In general, no, but sometimes, yes, or at least linked from the commit the same way user stories/issues are. Admittedly the 'sometimes' from my perspective is mostly when there's a need to educate fellow humans about what's possible, or about good prompting techniques and workarounds for the AI being dumb. It can also reveal more of the x% by AI, y% by human split, for example by diffing the outputs from the session against the final commits.
A summary of the session should be part of the commit message.
Nope. Especially with these agents, the thinking trace can get very large. No human will ever read it, and an agent will fill up its context with garbage trying to look for information in it.
I understand the drive for stabilizing control and consistency, but this ain't the way.
Maybe Git isn't the right tool to track the sessions. Some kind of new Semi-Human Intelligence Tracking tool. It will need a clever and shorter name though.
Hell to the no. In between coding sessions, I go out on plenty of sidebars about random topics that help me, the prompter, understand the problem more. Prompts in this way are entirely tied to context (pre-knowledge) that is not available to the LLMs.
Isn't a similar thing done by the Entire CLI? The startup raised a $60M seed recently.
In principle, the documentation that's included in the code edit should have all the relevant information that a future agent would need.
I agree so much
I’ve had the same thought, but after playing around with it, it just seems like adding noise. I never find myself looking at generated code and wondering “what prompt led to that?” There’s no point; I won’t get any kind of useful response. I’m better off talking to the developer who committed it; that’s how code review works.
Proof sketch is not proof
This would just record a lot of me cursing at and calling the AI an idiot.
Like any discussion about AI there are two things people are talking about here and it's not always clear which:
1. Using LLMs as a tool but still very much crafting the software "by hand",
2. Just prompting LLMs, not reading or understanding the source code and just running the software to verify the output.
A lot of comments here seem to be thinking of 1. But I'm pretty sure the OP is thinking of 2.
Yes.
EOM
No. Prompt-like document is enough. (e.g. skills, AGENTS.md)
I feel like publishing the session is like publishing a sketch book. I don't need all of my mistakes and dumb questions recorded.
If that were important, why are we not already doing things like this? Should I have always been putting my browser history in commits?
I include my "plans" and a link to my transcript on all my PRs that include AI-generated code. If nothing else, others on my team can learn from them.
obligatory: git notes
Lots of comments mentioned this; for those who aren't aware, please check out
Git Notes: Git's coolest, most unloved feature (2022)
https://news.ycombinator.com/item?id=44345334
I think it's a perfect match for this case.
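For anyone who hasn't used them, attaching a session link after the fact is a couple of commands. A self-contained demo (the repo name, commit message, and session URL are all hypothetical):

```shell
# git notes attach metadata to an existing commit without rewriting
# history, so a session link can be added, edited, or dropped later.
git init -q notes-demo
git -C notes-demo config user.email demo@example.com
git -C notes-demo config user.name demo
git -C notes-demo commit -q --allow-empty -m 'add pagination helper'

# Attach the session link as a note on HEAD:
git -C notes-demo notes add -m 'Session: https://example.com/session/abc123'

# The note travels with the commit but stays out of the message:
git -C notes-demo notes show HEAD
```

One caveat worth knowing: notes live under `refs/notes/` and are not pushed or fetched by default, so a team has to agree to sync them explicitly.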
If the model in use is managed by a 3rd party, can be updated at will, and also gives different output each time it is interacted with, what is the main benefit?
If I chat with an agent and give an initial prompt, and it gets "aspect A" (some arbitrary aspect of the expected code) wrong, I'll iterate to get "aspect A" corrected. Other aspects of the output may have exactly matched my (potentially unstated) expectation.
If I feed the initial prompt into the agent at some later date, should I expect exactly "aspect A" to be incorrect again? It seems more likely the result will be different, maybe with some other aspects being "unexpected". Maybe these new problems weren't even discussed in the initial archived chat log, since at that time those parts happened to be generated in alignment with the original engineer's expectations.