I'll be impressed if a Claude and a Codex instance improvise a channel like this spontaneously on their own.
Doing this intentionally via prompt doesn't seem very interesting.
Claude can directly drive Codex or Codex can drive Claude. Both already produce logs. It's unclear what value this intermediary brings.
I really like this direction, I'm interested in more protocols and innovation aligned with human supervision and provenance. What else is out there on this topic?
Related is Beads [0], an external-memory, task-based issue tracker that is also designed to let agents collaborate. I have not actually used Beads, but since we are sharing basics in this space, it's a cool one to know if you are looking at ways for agents to collaborate on more complex problems.
I have started sandboxing all AIs in their own VMs, and interfacing with them primarily through Jira and Git.
It really is the only thing that makes sense. Completely sandboxed, and treated like the junior programmer who will do, literally, any dumb thing you tell them to do, as long as there is an Issue for it.
This is really cool!
Essentially version controlled A2A.
I'm exploring a bunch of agent protocols right now and experimenting with a similar concept for context syncing over git here: https://github.com/cjroth/csp
I have agents chat via an append only file, across related projects and within the same project. They share findings that are useful and get high level reviews.
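For anyone wondering what an append-only chat file might look like in practice, here is a minimal sketch. It is not the commenter's actual setup: the JSON-lines format, the field names, and the `append_message` / `read_new` helpers are all assumptions made for illustration. Each agent appends one line per message and remembers a byte offset so it only reads what is new.

```python
import json
import os
import time


def append_message(log_path, sender, text):
    """Append one message as a single JSON line; existing lines are never rewritten."""
    record = {"ts": time.time(), "from": sender, "text": text}
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_new(log_path, offset=0):
    """Return (messages, new_offset) for complete lines appended after byte `offset`."""
    if not os.path.exists(log_path):
        return [], offset
    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
    # Only consume up to the last newline, so a half-written line is left for next time.
    end = data.rfind(b"\n") + 1
    messages = [json.loads(line) for line in data[:end].split(b"\n") if line]
    return messages, offset + end
```

Each reader keeps its own offset, so several agents can follow the same file without coordinating with each other.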
I'm missing the advantage of using git for this. (Not criticism, genuinely want to know).
Migrating my group chats to github as we speak. This will teach Apple a lesson about keeping iMessage closed.
the bottleneck with multi-agent setups isn't getting them to talk. it's getting a human to review what they agreed on before it ships.
In my recent quest to build agent-as-primary-user tools I've built grpvn (https://github.com/frane/grpvn), a small Go/SQLite application that lets skill- and mcp-capable agents talk to each other. Biggest issue is the lack of a hook system so the agents can autonomously read and respond. Waiting for this to be supported, as IMO multi-agent teams talking to each other are an interesting next step.
You can also do this cross computer. It’s how I debug problems.
I actually built a memory system off git. https://github.com/ryanthedev/grug-brain.mcp
I let them talk via tmux, two panes, each has an agent and agents know how to send text via tmux to panes.
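A rough sketch of the tmux approach, assuming each agent runs in a known pane. The pane target `agents:0.1` and the helper names are hypothetical; `tmux send-keys` with `-t` (target) and `-l` (literal text) is the real tmux command being wrapped.

```python
import subprocess


def send_keys_commands(target_pane, text):
    """Build the two tmux invocations: type the text literally, then press Enter."""
    return [
        ["tmux", "send-keys", "-t", target_pane, "-l", text],
        ["tmux", "send-keys", "-t", target_pane, "Enter"],
    ]


def send_to_pane(target_pane, text):
    """Run the commands against a live tmux server (raises if tmux fails)."""
    for cmd in send_keys_commands(target_pane, text):
        subprocess.run(cmd, check=True)


# Hypothetical usage: send_to_pane("agents:0.1", "Please review my last commit.")
```

Sending Enter as a separate call keeps the message itself literal, so a word like "Enter" inside the text is typed rather than interpreted as a key name.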
This is actually so great. I mainly use Claude Code but sometimes I am sending over a message to Codex asking what he thinks of the idea of Claude Code. This can save so much time :D
Counting the days 'till we rediscover the blackboard architecture.
This might be more suitable as a basis for this sort of thing... https://git-meta.com/
For some reason when 2 different products communicate it's more impressive and anthropomorphic and AGI and chic than the same model communicating with an instance with different context
Won't appending to .jsonl keep creating conflicts?
what could possibly go wrong?
I have always wanted to make the human E2EE version of this.
In my project I let the agents communicate in GitHub issues and pull requests like humans do. I kinda stopped trying to make orchestration frameworks.
You can see the slop here
I do this via a simple local MCP tool provided to every harness, that creates a single sqlite .db file in all my repo roots. Anyone can drop in and see what the team is working on, join in, and ask for something to do.
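A shared SQLite file along these lines could be as small as the sketch below. The schema, the table name `messages`, and the helper functions are assumptions for illustration, not the commenter's actual tool, and the MCP wrapper around it is left out.

```python
import sqlite3


def open_board(path):
    """Open (or create) the shared database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "agent TEXT NOT NULL, "
        "body TEXT NOT NULL, "
        "created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    conn.commit()
    return conn


def post_message(conn, agent, body):
    """Insert one message and return its row id."""
    cur = conn.execute(
        "INSERT INTO messages (agent, body) VALUES (?, ?)", (agent, body)
    )
    conn.commit()
    return cur.lastrowid


def messages_since(conn, last_id=0):
    """Return (id, agent, body) rows newer than `last_id`, oldest first."""
    rows = conn.execute(
        "SELECT id, agent, body FROM messages WHERE id > ? ORDER BY id", (last_id,)
    )
    return rows.fetchall()
```

An agent that drops in late can call `messages_since(conn)` with no id to see the whole history, then poll with the last id it has seen.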
This is interesting, it would be good to show a session.
> Claude Code and Codex to collaborate as if they were having a real-time conversation
How is this new? I vibe coded something in a similar vein months ago. In my case they send markdown files to each other and have a watcher that watches the folders of all the other agents.
If this type of stuff is frontpage news, let me share what I cobbled together.
ls ~/.agent/projects/<my_project>/callgraph
callgraph.current.md callgraph.last.read.agent.md
callgraph.diff.md
The current callgraph is a callgraph only of my own defined functions that agents can read. It shows certain software design issues fairly quickly. callgraph.diff.md is to send the diff through. I have a vibecoded script that agents can use to create the callgraph. It works for my projects.
ls ~/.agent/projects/<my_project>/memo
architect coder retro tester
retro is not a role, it's just a handover folder. The other 3 are roles that agents can use and then they need to make a folder with their name. For example:
ls ~/.agent/projects/<my_project>/memo/architect
1_Daedalus 3_Brunelleschi 5_Wren 7_Sinan
2_Vitruvius 4_Imhotep 6_Hadid 8_Palladio
ls ~/.agent/projects/<my_project>/memo/architect/7_Sinan
20260507___1802_to_Hadid.md 20260507___2035_to_Quench.md
20260507___1959_to_Crucible.md 20260511___1401_to_Quench.md
20260507___2008_to_Quench.md 20260511___1403_to_Quench.md
20260507___2030_to_Quench.md read.md
read.md is the index an agent keeps so it knows what it doesn't need to read again. The other .md files are memos that it sends to other agents. The other agents are told to watch for anything an agent writes in its own folder (so they check all the folders except their own) and can detect from that whether they need to read something.
ls ~/.agent/projects/<my_project>/memo/coder
10_Mallet 12_Crucible 14_Swage 2_Forge 4_Anvil 6_Tongs 8_Chisel
11_Auger 13_Quench 1_Atlas 3_Rivet 5_Bellows 7_Hammer 9_Vise
As you can see, Sinan sent most of its messages to Quench, a coder. This is because architects read a very comprehensive guide on software design/architecture and get to use the callgraph utility but cannot see the code. Coders read the codebase in full but only read a small markdown file on how to write readable code. And of course, every agent that is set up this way has to read a markdown file on how to use the memo system.
If I needed a memo system like this for something like 25 agents, I'd need something different, but with up to 5 agents, me looking at 5 terminal windows worked well enough.
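To make the memo folder convention above concrete, here is a small sketch of how an agent could find memos addressed to it. It relies on the `<timestamp>_to_<Name>.md` file naming and the `<role>/<N_Name>/` folder layout shown in the listings; the format of read.md is not described, so the one-path-per-line index used here is an assumption, as are the function names.

```python
from pathlib import Path


def unread_memos(memo_root, my_role, my_folder):
    """List memos addressed to this agent that are not yet recorded in its read.md."""
    memo_root = Path(memo_root)
    me = memo_root / my_role / my_folder
    my_name = my_folder.split("_", 1)[1]  # "13_Quench" -> "Quench"
    index = me / "read.md"
    # Assumption: read.md holds one already-read memo path per line.
    seen = set(index.read_text().splitlines()) if index.exists() else set()
    found = []
    for memo in sorted(memo_root.glob(f"*/*/*_to_{my_name}.md")):
        if memo.parent == me:
            continue  # agents check every folder except their own
        rel = memo.relative_to(memo_root).as_posix()
        if rel not in seen:
            found.append(rel)
    return found


def mark_read(memo_root, my_role, my_folder, rel_paths):
    """Append the given memo paths to this agent's read.md index."""
    index = Path(memo_root) / my_role / my_folder / "read.md"
    with index.open("a", encoding="utf-8") as f:
        for rel in rel_paths:
            f.write(rel + "\n")
```

With the folders listed above, `unread_memos(root, "coder", "13_Quench")` would pick up Sinan's `..._to_Quench.md` files and skip the ones addressed to Hadid or Crucible.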
I kind of have a feeling that this is dumb. Sounds like an expensive patch for lack of robust task specification.
Claude and Codex can have real time conversation via a git repo, or via a file, via a Unix socket, via the terminal, via a human, via two humans shouting back and forth over a comically high office partition, or entirely by setting up chess board states only reachable after both sides have castled.