Hacker News

Show HN: 20+ Claude Code agents coordinating on real work (open source)

35 points by austinbaggio | today at 4:23 PM | 32 comments

Single-agent LLMs suck at long-running complex tasks.

We’ve open-sourced a multi-agent orchestrator that we’ve been using to handle long-running LLM tasks. We found that single LLM agents tend to stall, loop, or generate non-compiling code, so we built a harness for agents to coordinate over shared context while work is in progress.

How it works:

1. An orchestrator agent manages task decomposition
2. Sub-agents for parallel work
3. Subscriptions to task state and progress
4. Real-time sharing of intermediate discoveries between agents
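To make the shape of that concrete, here's a minimal sketch of the pattern in plain Python: an orchestrator decomposes a task, sub-agents run in parallel, and everyone publishes to and subscribes on a shared board. The names (TaskBoard, sub_agent, orchestrate) are hypothetical and threads stand in for Claude Code sub-agents; this is not the skill's actual code.

```python
# Hypothetical sketch of the orchestrator/sub-agent/shared-board pattern.
# Threads stand in for LLM sub-agents; the real skill drives Claude Code instead.
import threading
from dataclasses import dataclass, field


@dataclass
class TaskBoard:
    """Shared state every agent can read, update, and subscribe to."""
    notes: list[str] = field(default_factory=list)
    status: dict[str, str] = field(default_factory=dict)
    _lock: threading.Lock = field(default_factory=threading.Lock)
    _subscribers: list = field(default_factory=list)

    def publish(self, agent: str, note: str) -> None:
        # Record an intermediate discovery and notify subscribers right away.
        with self._lock:
            self.notes.append(f"[{agent}] {note}")
            for callback in self._subscribers:
                callback(agent, note)

    def subscribe(self, callback) -> None:
        with self._lock:
            self._subscribers.append(callback)


def sub_agent(name: str, subtask: str, board: TaskBoard) -> None:
    """Stand-in for one sub-agent working on a decomposed piece of the task."""
    board.status[name] = "running"
    # ... real work (LLM calls, edits, tests) would happen here ...
    board.publish(name, f"finished '{subtask}', no blockers found")
    board.status[name] = "done"


def orchestrate(task: str, subtasks: list[str]) -> TaskBoard:
    """Decompose a task, run sub-agents in parallel, and watch their progress."""
    board = TaskBoard()
    board.subscribe(lambda agent, note: print(f"progress from {agent}: {note}"))

    workers = [
        threading.Thread(target=sub_agent, args=(f"agent-{i}", sub, board))
        for i, sub in enumerate(subtasks)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return board


if __name__ == "__main__":
    orchestrate("refactor module", ["split files", "fix imports", "update tests"])
```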

We tested this on a Putnam-level math problem, but the pattern generalizes to things like refactors, app builds, and long research. It’s packaged as a Claude Code skill and designed to be small, readable, and modifiable.

Use it, break it, and tell me what workloads we should try running next!


Comments

giancarlostoro | today at 7:16 PM

I feel like there's two camps:

* Throw more agents
* Use something like Beads

I'm in the latter camp: I don't have infinite resources, so I'd rather stick to one agent and optimize what it can do. When I hit my Claude Code limit, I stop; I use Claude Code primarily for side projects.

raniazyane | today at 8:27 PM

I wonder if there’s a third camp that isn’t about agent count at all, but about decision boundaries.

At some point the interesting question isn’t whether one agent or twenty agents can coordinate better, but which decisions we’re comfortable fully delegating versus which ones feel like they need a human checkpoint.

Multi-agent systems solve coordination and memory scaling, but they also make it easier to move further away from direct human oversight. I’m curious how people here think about where that boundary should sit — especially for tasks that have real downstream consequences.

visarga | today at 6:59 PM

Great work! I like the approach of maximum freedom inside bounded blast radius and how you use code to encode policy.

miligauss | today at 4:39 PM

It's more of a black box with Claude; at least with this you see the proof strategy and the mistakes the model makes when it decomposes the problem. I think instead of Ralph looping you get something that is top-down. If models were smarter and context windows bigger, I'm sure complex tasks like this one would be simpler, but breaking it down into sub-agents and having a collective "we already tried this strategy and it backtracked" intelligence is a nice way to scope a limited context window to an independent sub-problem.
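As a rough sketch, that collective "already tried" memory could be as simple as a shared log of failed strategies that each sub-agent checks before spending context on one. The names here are hypothetical and this is not the project's code:

```python
# Hypothetical sketch of a shared attempt log: sub-agents skip strategies
# another agent has already seen fail, instead of rediscovering the dead end.
from dataclasses import dataclass, field


@dataclass
class AttemptLog:
    failed: dict[str, str] = field(default_factory=dict)  # strategy -> why it failed

    def record_failure(self, strategy: str, reason: str) -> None:
        self.failed[strategy] = reason

    def should_try(self, strategy: str) -> bool:
        return strategy not in self.failed


log = AttemptLog()
log.record_failure("induction on n", "base case holds but the step backtracked")

for strategy in ["induction on n", "pigeonhole argument"]:
    if log.should_try(strategy):
        print(f"assigning a sub-agent to: {strategy}")
    else:
        print(f"skipping {strategy}: {log.failed[strategy]}")
```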

yodon | today at 5:26 PM

The first screen of your signup flow asks for "organization" - is that used as a username, as an organization name, or both? (I can't tell what, if anything, will be on the next screen.)

If your registration process is eventually going to ask me for a username, can the org name and user name be the same?

clairekart | today at 4:31 PM

What’s the failure mode you see with single-agent Claude Code on complex tasks? (looping, context drift, plan collapse, tool misuse?)

yodon | today at 4:40 PM

Can you add a license.txt file so we know we have permission to run this? (e.g. MIT and GPL v3 are very different)

christinetyip | today at 4:49 PM

Cool, what’s a good first task to try this on where it’s likely to beat a single agent?

slopusila | today at 5:13 PM

Seems like it requires an API key for your proprietary Ensue memory system.

zmanian | today at 4:52 PM

How does progress subscription work — are agents watching specific signals (test failures, TODO list, build status), or just a global feed?
