Hey HN! I'm José, and I built Recall to solve a problem that was driving me crazy.
The Problem: I use Claude for coding daily, but every conversation starts from scratch. I'd explain my architecture, coding standards, past decisions... then hit the context limit and lose everything. Next session? Start over.
The Solution: Recall is an MCP (Model Context Protocol) server that gives Claude persistent memory using Redis + semantic search. Think of it as long-term memory that survives context limits and session restarts.
How it works:

- Claude stores important context as "memories" during conversations
- Memories are embedded (OpenAI) and stored in Redis with metadata
- Semantic search retrieves relevant memories automatically
- Works across sessions, projects, even machines (if you use cloud Redis)
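To make that concrete, here is the core loop as a heavily simplified TypeScript sketch (not the actual implementation: the key schema and brute-force cosine scan are illustrative, and a production setup would use Redis vector search instead of scanning keys):

    import OpenAI from "openai";
    import { createClient } from "redis";

    const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
    const redis = createClient({ url: process.env.REDIS_URL });
    await redis.connect();

    // Embed a piece of text with the same model Recall uses.
    async function embed(text: string): Promise<number[]> {
      const res = await openai.embeddings.create({
        model: "text-embedding-3-small",
        input: text,
      });
      return res.data[0].embedding;
    }

    // Store a memory as a Redis hash: text + metadata + embedding.
    async function storeMemory(id: string, text: string, workspace: string) {
      await redis.hSet(`memory:${id}`, {
        text,
        workspace,
        embedding: JSON.stringify(await embed(text)),
      });
    }

    // Retrieve by cosine similarity. KEYS plus a full scan is fine for a
    // sketch; a real server would use RediSearch vector queries (FT.SEARCH).
    async function searchMemories(query: string, workspace: string) {
      const q = await embed(query);
      const scored: { text: string; score: number }[] = [];
      for (const key of await redis.keys("memory:*")) {
        const m = await redis.hGetAll(key);
        if (m.workspace !== workspace) continue; // workspace isolation
        const v: number[] = JSON.parse(m.embedding);
        const dot = v.reduce((s, x, i) => s + x * q[i], 0);
        scored.push({ text: m.text, score: dot / (Math.hypot(...v) * Math.hypot(...q)) });
      }
      return scored.sort((a, b) => b.score - a.score).slice(0, 5);
    }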
Key Features:

- Global memories: Share context across all projects
- Relationships: Link related memories into knowledge graphs
- Versioning: Track how memories evolve over time
- Templates: Reusable patterns for common workflows
- Workspace isolation: Project A memories don't pollute Project B
Tech Stack:

- TypeScript + MCP SDK
- Redis for storage
- OpenAI embeddings (text-embedding-3-small)
- ~189KB bundle, runs locally
Current Stats:

- 27 tools exposed to Claude
- 10 context types (directives, decisions, patterns, etc.)
- Sub-second semantic search on 10k+ memories
- Works with Claude Desktop, Claude Code, any MCP client
Example Use Case: I'm building an e-commerce platform. I told Claude once: "We use Tailwind, prefer composition API, API rate limit is 1000/min." Now every conversation, Claude remembers and applies these preferences automatically.
What's Next (v1.6.0 in progress):

- CI/CD pipeline with GitHub Actions
- Docker support for easy deployment
- Proper test suite with Vitest
- Better error messages and logging
Try it:
    npm install -g @joseairosa/recall
    # Add to claude_desktop_config.json
    # Start using persistent memory
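A typical claude_desktop_config.json entry looks something like this (the env variable names here are illustrative; see the README for the exact ones):

    {
      "mcpServers": {
        "recall": {
          "command": "npx",
          "args": ["-y", "@joseairosa/recall"],
          "env": {
            "REDIS_URL": "redis://localhost:6379",
            "OPENAI_API_KEY": "sk-..."
          }
        }
      }
    }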
The code is written by Claude, the README is written by Claude, this HN post is written by Claude.
My God, there’s no signal. It’s all noise.
Why would you not use context files in the form of .md? E.g. how the SpecKit project does it.
How does Claude know when to try and remember?
Often memory works too well and crowds out new things, so how are you balancing that?
I built a memory tool about 6 months ago while playing with MCP; it was based on a SQLite db. My experience then was that Claude wasn't very good at using the tools. Even with instructions to be proactive about searching memory and saving new memories, it would rarely do so. Once you did press it to be sure to save memories, it would go overboard, basically saving every message in the conversation as a memory. Are you seeing more success in getting natural and seamless usage of the memory tools?
IIRC at the time I was testing with Sonnet 3.7; I haven't tried it on the newer models.
I think everyone has concluded at this point that we need to improve models' memory capabilities, but different people take different approaches.
My experience is that ChatGPT can engage in very thoughtful conversations, but if I ask for a summary it produces something very generic: useful to an outsider, but missing the salient points that were the most important outcomes.
Did you notice the same problem?
I’ve started asking Claude to write tutorials that live in a _docs folder alongside my code.
Then it can reference those tutorials for specific things.
Interested in giving this a shot but it feels like a lot of infrastructure.
The memory feature I'd like to have would need built-in support from Anthropic.
It'd be, essentially:
1. Language server support for lookups & keeping track of the code
2. Being able to "pin" memories to functions, classes, properties, etc. via that language server support. The pinned context is provided whenever changes are made in that function/class/property, but it isn't kept, so all following changes outside of it no longer include this context (basically, changes that touch code with pinned memories are done by agents with the additional context, and only the results are synced back, not the way they were achieved)
3. Provide an IDE integration for this context so you can easily keep track of what's available just by moving the cursor to the point where the memory is pinned
Sadly impossible to achieve via MCP.
A great hack/shortcut for solving this "memory" problem is to have a rolling RAG KB. You don't fill up the context, and you can use a re-ranking model to further improve accuracy (rough sketch below).
Aside from all that, using npm for distribution makes this a total non-starter for me.
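Roughly, the shape of it (vectorSearch and rerank here are hypothetical stand-ins for a real vector store query and a re-ranking model):

    // Rolling RAG knowledge base with re-ranking: over-fetch cheap
    // candidates, re-rank with an accurate model, keep what fits a budget.
    type Doc = { id: string; text: string };

    async function recallContext(
      query: string,
      vectorSearch: (q: string, k: number) => Promise<Doc[]>,
      rerank: (q: string, docs: Doc[]) => Promise<Doc[]>,
      budget = 4000 // rough character budget so the context never fills up
    ): Promise<string> {
      // 1. Over-fetch candidates cheaply from the embedding index.
      const candidates = await vectorSearch(query, 50);
      // 2. Let the slower, more accurate re-ranker order them.
      const ranked = await rerank(query, candidates);
      // 3. Keep only what fits the rolling budget.
      const out: string[] = [];
      let used = 0;
      for (const doc of ranked) {
        if (used + doc.text.length > budget) break;
        out.push(doc.text);
        used += doc.text.length;
      }
      return out.join("\n---\n");
    }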
I built something similar but now use Codex instead.
Using the VS Code extension you get dynamic context management which works really well.
They also have a memory system built using reflexion (someone please correct me if I'm wrong) so proper evals are derived from lessons before storing.
I'm surprised Anthropic doesn't offer something like this server-side, with an API to control it. It seems like it'd be a lot more efficient than having the client manually rework the context and upload the whole thing.
Imho you would have an easier sell if you separated knowledge into tiers: 1) overall design 2) coding standards 3) the reasoning that led to the design 4) components and their individual structure 5) your current issue 6) etc.
Your project becomes progressively more valuable the further you go down the list. The overall design should be documented and curated to onboard new hires. Documenting current issues is a waste of time compared to capturing live discussion, so Recall is super useful here.
Claude introduced its own memories API... have you had a look?
I wish there was a way to send compressed context to LLMs instead of plain text. This would reduce token counts, improving performance and cutting operational costs.
Seems overkill when you can simply tell agents to do that automatically
Memory is hard! I'm very curious how the version history approach is working for you. Have you considered age when retrieving? Is the model supposed to manage the version history on its own? Is the semantic search used to help with that?
The problem is you need to explicitly prompt Claude to "Store" or "Remember"; if you don't, it will never call the MCP server. Ideally, Claude would have some mechanism to store memories without any explicit prompting, but I don't think that's possible today.
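The common workaround is standing instructions in CLAUDE.md or the system prompt telling Claude when to call the tools, along these lines (tool names and wording illustrative):

    # Memory policy (illustrative)
    - At the start of a task, call search_memory with the task description.
    - When the user states a preference, decision, or constraint, call
      save_memory with a one-sentence summary before continuing.
    - Do not store routine conversation turns.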
imo it would be better to carry the whole memory outside of inference time, where you could use an LLM as a judge to track the output of the chat and the prompts submitted
it would work sort of like Grammarly itself, and you could use it to metaprompt (rough sketch below)
I find all the memory tooling, even the native ones on Claude and ChatGPT, to be too intrusive
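One possible shape of that out-of-band judge, as a rough TypeScript sketch (the OpenAI judge, model choice, and prompt are illustrative stand-ins; it runs outside the main agent's inference path):

    import OpenAI from "openai";

    const judge = new OpenAI();

    // After each exchange, a separate cheap model decides what, if
    // anything, is durable enough to store. Assumes the judge complies
    // with the JSON-only instruction.
    async function extractMemories(userMsg: string, assistantMsg: string): Promise<string[]> {
      const res = await judge.chat.completions.create({
        model: "gpt-4o-mini", // any inexpensive judge model
        messages: [
          {
            role: "system",
            content:
              "You watch a coding conversation. Return a JSON array of " +
              "durable facts worth remembering (decisions, preferences, " +
              "constraints). Return [] if nothing qualifies.",
          },
          { role: "user", content: `User: ${userMsg}\nAssistant: ${assistantMsg}` },
        ],
      });
      return JSON.parse(res.choices[0].message.content ?? "[]");
    }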
I'm not super familiar with context and "memory", but doesn't adding context manually or via memory end up consuming context length either way?
Wouldn't the cache over time also be filled up with irrelevant and redundant information?
Do you think any vector db would work better than Redis?
Why not just ask CC to write a prompt or Markdown file to re-start the conversation in a new chat?
Every single persistent memory feature is a persistence vector for prompt injection.
If this delivers, it can be a 100% game changer. I will try it out and give some feedback
This is excellent for those of us who are building local AIs.
Throwing it out there, not sure how well it'd work, but what about using OpenSearch + vectors?
AI can already form the query DSL quite nicely, especially if it knows the indexes.
I set up AI powered search this way, and it works really well with any open ended questions.
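Roughly what the k-NN side looks like with the JS client (the index and field names are made up, and this assumes the OpenSearch k-NN plugin is enabled):

    import { Client } from "@opensearch-project/opensearch";

    const client = new Client({ node: "https://localhost:9200" });

    // k-NN query against an index whose "embedding" field is a knn_vector.
    async function searchMemories(queryVector: number[]) {
      const res = await client.search({
        index: "memories",
        body: {
          size: 5,
          query: {
            knn: { embedding: { vector: queryVector, k: 5 } },
          },
        },
      });
      return res.body.hits.hits; // each hit carries _score and _source
    }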
how did you benchmark this against much less convoluted solutions, like "a text file"?
how much better was this to justify all that extra complexity?
I'm not seeing how this is any different from a standard vector database MCP tool. It's not like Claude is going to know about any of the things you told it to "remember" unless you explicitly tell it to use its memory tool, as shown in the demo, to recall something you've stored.
Heh, I'm building the same thing this week (albeit with postgres rather than redis). I bet like 15% of the people here are.
Why would you bloat the (already crowded) context window with 27 tools instead of the 2 simplest ones: Save Memory & Search Memory? Or even just search, handling the save process through a listener on a directory of markdown memory files that Claude Code can natively edit?
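For scale, here's roughly what that two-tool version looks like with the MCP TypeScript SDK (a sketch with the storage layer stubbed out, not Recall's code):

    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import { z } from "zod";

    const server = new McpServer({ name: "tiny-memory", version: "0.1.0" });

    server.tool(
      "save_memory",
      { text: z.string().describe("Fact worth remembering") },
      async ({ text }) => {
        // ...embed and persist `text` here...
        return { content: [{ type: "text", text: "Saved." }] };
      }
    );

    server.tool(
      "search_memory",
      { query: z.string() },
      async ({ query }) => {
        // ...semantic search here...
        return { content: [{ type: "text", text: `Results for: ${query}` }] };
      }
    );

    await server.connect(new StdioServerTransport());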