IMO: This might be a contrarian opinion, but I don't think so. It's much the same problem as asking, for example, if every single line you write, or every function, becomes a commit. The answer to this question of granularity is, much like anything, to think of the audience: who is served by persisting these sessions? I suspect there is little reason future engineers, or future LLMs, would need access to them; they likely contain a significant amount of noise, incorrect implementations, and red herrings. The product of the session is what matters.
I do think there's more value in ensuring that the initial spec, or the "first prompt" (which IME is usually much bigger and tries to get 80% of the way there) is stored. And, maybe part of the product is an LLM summary of that spec, the changes we made to the spec within the session, and a summary of what is built. But... that could be the commit message? Or just in a markdown file. Or in Notion or whatever.
> It's much the same problem as asking, for example, if every single line you write, or every function, becomes a commit.
Hmm, I think that's the wrong comparison? The more useful comparison might be: should all the notes you made and dead ends you tried become part of the commit?
There is some potential value for auditing if you work in a special place where you are sworn in and transparency is important, but who is going to read all of that, and how do you even know the transcript corresponds to the code if the committer is up to something?
This is a central problem that we've already seen proliferate wildly in scientific research, and if the same is allowed to become embedded in foundational code, the future outlook is grim.
Replication crisis[1].
Given the initial conditions, and even accounting for 'noise', would an LLM arrive at the same output? It should, for the same reason math problems require you to show your working. Scientific papers require methods and pseudocode, while also requiring limitations to be stated.
Without similar guardrails, maintenance and extension of future code becomes a choose-your-own-adventure, where you have to guess at the intent and conditions of the LLM used.
[1] https://www.ipr.northwestern.edu/news/2024/an-existential-cr...
I agree that probably not everything should be stored - it’s too noisy. But the reason the session is so interesting is precisely the later part of the conversation - all the corrections in the details, where the actual, more precise requirements crystallize.
For me, it’s about preserving optionality.
If I can run resume {session_id} within 30 days of a file’s latest change, there’s a strong chance I’ll continue evolving that story thread—or at least I’ve removed the friction if I choose to.
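One low-friction way to preserve that optionality is to record the session ID as a commit trailer and check the file's age before resuming. This is a sketch under my own assumptions: the `Session-ID:` trailer name is hypothetical, and the actual resume command depends on your agent tooling.

```shell
#!/bin/sh
# Pull a hypothetical Session-ID trailer out of a commit message
# (read on stdin), so the matching agent session can be resumed.
extract_session_id() {
  sed -n 's/^Session-ID:[[:space:]]*//p' | head -n 1
}

# Check whether an epoch timestamp falls within the last 30 days,
# i.e. whether the session is likely still resumable.
within_30_days() {
  now=$(date +%s)
  age=$(( now - $1 ))
  [ "$age" -le $(( 30 * 24 * 60 * 60 )) ]
}
```

With trailers like that in place, something along the lines of `git log -1 --format=%B -- path/to/file | extract_session_id` would hand you the thread to resume.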
> Who is served by persisting these sessions? I would suspect that there is little reason why future engineers, or future LLMs, would need access to them
I disagree. When working on legacy code, one of my biggest issues is usually the question 'why is this the way it is?' Devs hate documentation, Jira often isn't updated with decisions made during programming, so sometimes you just have to guess why 'wait(500)' or 'n = n - 1' are there.
If it was written with AI and the conversation history is available, I can ask my AI: 'why is this code here?', which would often save me a ton of time and headache when touching that code in the future.
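One way to keep that history retrievable per commit without bloating the commit message itself is git notes. A sketch, assuming you keep transcripts in a dedicated notes ref (the `sessions` ref name is my own choice, not a convention):

```shell
#!/bin/sh
# Attach a session transcript to a commit on a separate notes ref,
# keeping it out of the commit message but retrievable by anyone
# who fetches that ref.
attach_session() {
  commit=$1
  transcript=$2
  git notes --ref=sessions add -F "$transcript" "$commit"
}

# Print the transcript attached to a commit, if any.
show_session() {
  git notes --ref=sessions show "$1"
}
```

Then `git blame` gives you the commit, and `show_session <commit>` gives you the conversation to feed back to your AI.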
The first N prompts are a good, practical heuristic for what's worth storing (whether N = 1 or greater).
You're ignoring the reality of vibe coding. If someone just prompts, never reads the code, and barely tests the result, then the prompts can be a valuable insight.
But I am not rooting for either, just saying.
LLM session transcripts as part of the commit is a neat idea to consider, to be sure, but I know that I damn well don't want to read eight pages of "You're absolutely right! It's not a foo. It's a bar" slop (for each commit no less!) when I'm trying to find someone to git blame.
The solution is as it always has been: the commit message is where you convey to your fellow humans, succinctly and clearly, why you made the commit.
I like the idea of committing the initial transcript somewhere in the docs/ directory or something. I'll very likely start doing this in my side projects.
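A minimal sketch of what that could look like; the `docs/prompts/` layout and naming the file after the branch are my own made-up conventions, not an established practice:

```shell
#!/bin/sh
# Stage the initial spec prompt alongside the code it produced,
# named after the current branch so it's easy to correlate later.
store_initial_prompt() {
  prompt_file=$1
  mkdir -p docs/prompts
  branch=$(git rev-parse --abbrev-ref HEAD)
  cp "$prompt_file" "docs/prompts/${branch}.md"
  git add "docs/prompts/${branch}.md"
}
```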
While it's noisy and complicated for humans to read through, this session info is primarily for future AI to read and use as additional input for their tasks.
We could have LLMs ingest all these historical sessions, and use them as context for the current session. Basically treat the current session as an extension of a much, much longer previous session.
Plus, future models might be able to "understand" the limitations of current models, and use the historical session info to identify where the generated code could have deviated from user intention. That might be useful for generating code, or just for more efficient analysis by focusing on possible "hotspots", etc.
Basically, it's high time we start capturing any and all human input for future models, especially open source model development, because I'm sure the companies already have a bunch of this kind of data.