imo it would be better to handle the whole memory step outside of inference time, where you could use an LLM as a judge to track the chat output and the prompts submitted
it would work sort of like grammarly, and you could use it for metaprompting (rough sketch below)
i find all the memory tooling, even the native ones in claude and chatgpt, to be too intrusive
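roughly what i have in mind, as a sketch (i'm assuming the anthropic python sdk here; the judge prompt, the `judge_turn` helper, and the JSON schema are all made up for illustration):

```python
# a background "judge" that runs after each turn, outside the main inference path.
# client usage is the standard anthropic sdk; everything else here is illustrative.
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

JUDGE_PROMPT = """You watch a conversation and extract durable facts worth remembering.
Return JSON: {"memorable": bool, "memory": "<one-line fact or empty>"}.
Only flag stable preferences, project facts, or corrections. Ignore small talk."""

def judge_turn(user_msg: str, assistant_msg: str) -> dict:
    """Ask a cheap model whether this exchange contains anything worth storing."""
    resp = client.messages.create(
        model="claude-3-5-haiku-latest",  # cheap judge model; pick whatever fits
        max_tokens=200,
        system=JUDGE_PROMPT,
        messages=[{
            "role": "user",
            "content": f"USER: {user_msg}\nASSISTANT: {assistant_msg}",
        }],
    )
    # a real version would parse defensively; models sometimes wrap JSON in fences
    return json.loads(resp.content[0].text)

# called from whatever hook sees completed turns; the main chat never waits on it
def on_turn_complete(user_msg: str, assistant_msg: str, store: list[str]) -> None:
    verdict = judge_turn(user_msg, assistant_msg)
    if verdict.get("memorable"):
        store.append(verdict["memory"])
```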
Totally get what you're saying! Having Claude manually call memory tools mid-conversation does feel intrusive, especially since you have to keep saying Yes to the tool access.
Your approach is actually really interesting, like a background process watching the conversation and deciding what's worth remembering. More passive, less in-your-face.
I thought about this too. Here's how I weighed the tradeoff:
Your approach (judge/watcher):
- Pro: Zero interruption to conversation flow
- Pro: Can use a cheaper model for the judge
- Con: Claude doesn't know what's in memory when responding
- Con: Memory happens after the fact

Tool-based (current Recall):
- Pro: Claude actively uses memory while thinking
- Pro: Can retrieve relevant context mid-response
- Con: Yeah, it's intrusive sometimes
Honestly, both have merit. You could even do both: a background judge for auto-capture, plus tools for when Claude needs to look something up.
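If I wired up that hybrid, it might look something like this (just a sketch, not Recall's actual code; the tool shape follows the Anthropic Messages API, but the tool name, schema, and naive substring lookup are all made up):

```python
# hybrid sketch: writes happen in the background judge, reads go through one tool.
import anthropic

client = anthropic.Anthropic()

MEMORY_LOOKUP_TOOL = {
    "name": "memory_lookup",
    "description": "Search stored memories for facts relevant to the current question.",
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

def chat(history: list[dict], store: list[str]) -> anthropic.types.Message:
    resp = client.messages.create(
        model="claude-sonnet-4-20250514",  # main model; substitute freely
        max_tokens=1024,
        tools=[MEMORY_LOOKUP_TOOL],
        messages=history,
    )
    for block in resp.content:
        if block.type == "tool_use" and block.name == "memory_lookup":
            # toy retrieval: substring match over stored one-liners
            query = block.input["query"].lower()
            hits = [m for m in store if query in m.lower()]
            history.append({"role": "assistant", "content": resp.content})
            history.append({
                "role": "user",
                "content": [{
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": "\n".join(hits) or "no matches",
                }],
            })
            return client.messages.create(
                model="claude-sonnet-4-20250514",
                max_tokens=1024,
                tools=[MEMORY_LOOKUP_TOOL],
                messages=history,
            )
    return resp
```

The point being: writes never block the conversation, and the only interruption left is an explicit read, which only fires when Claude actually needs something.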
The Grammarly analogy is spot on. Passive monitoring vs active participation.
Have you built something with the judge pattern? I'd be curious how well it works for deciding what's memorable vs noise.
That's a cool idea, actually. Maybe Recall needs a "passive mode" option where it just watches and suggests memories instead of Claude actively storing them.
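Passive mode might just reroute the judge's verdicts into a pending queue instead of the store (sketch again, reusing the hypothetical `judge_turn` from your snippet above):

```python
# passive mode sketch: the judge only *suggests*; nothing lands in the store
# until the user approves. judge_turn is the hypothetical helper from the
# earlier sketch in this thread.
pending: list[str] = []

def on_turn_complete_passive(user_msg: str, assistant_msg: str) -> None:
    """Queue suggestions instead of writing them straight to memory."""
    verdict = judge_turn(user_msg, assistant_msg)
    if verdict.get("memorable"):
        pending.append(verdict["memory"])

def review(store: list[str]) -> None:
    """Batch-approve at whatever cadence the user likes, e.g. end of session."""
    while pending:
        suggestion = pending.pop(0)
        if input(f"remember '{suggestion}'? [y/N] ").strip().lower() == "y":
            store.append(suggestion)
```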
I've been building exactly this. It's currently a beta feature in my existing product. Can I reach out to you for feedback on the metaprompting/Grammarly aspect of it?