IMO this feels a lot like Simon Willison's definition of agents. "LLMs in a loop with a goal" seems super obvious in hindsight, but I'm not sure I would have described it that way beforehand.
One nuance that helps: “async” in the turn-based-telephone sense (you ask, it answers, you wait) is only one way agents can run.
Another is many turns inside a single LLM call — multiple agents (or voices) iterating and communicating dozens or hundreds of times in one epoch, with no API round-trips between them.
That’s “speed of light” vs “carrier pigeon”: no serialization across the boundary until you’re done. We wrote this up here: Speed of Light – MOOLLM (the README has the carrier-pigeon analogy and a 33-turn-in-one-call example).
Speed of Light vs Carrier Pigeon: The fundamental architectural divide in AI agent systems.
https://github.com/SimHacker/moollm/blob/main/designs/SPEED-...
The Core Insight: There are two ways to coordinate multiple AI agents.

Carrier Pigeon
  Where agents interact: between LLM calls
  Latency: 500 ms+ per hop
  Precision: degrades each hop
  Cost: high (re-tokenize everything)

Speed of Light
  Where agents interact: during one LLM call
  Latency: instant
  Precision: perfect
  Cost: low (one call)
MCP = Carrier Pigeon
  Each tool call: stop generation → wait for the external response → start a new completion
  N tool calls ⇒ N round-trips
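The round-trip pattern above can be sketched as a toy driver loop. To be clear, `fake_llm`, `run_tool`, and `drive` are illustrative stand-ins, not MCP's or any real API's actual interface; the point is only the accounting: every tool call ends one completion and forces another.

```python
# Toy model of the carrier-pigeon (tool-call) pattern:
# each tool call stops generation, waits for an external result,
# then starts a brand-new completion over the grown message list.

def fake_llm(messages):
    """Stand-in for a completion API: asks for one tool per call,
    then finishes once three tool results have come back."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if len(tool_results) < 3:
        return {"tool_call": f"lookup_{len(tool_results)}"}
    return {"text": "done"}

def run_tool(name):
    # Stand-in for the external tool executed between completions.
    return f"result of {name}"

def drive(prompt):
    messages = [{"role": "user", "content": prompt}]
    round_trips = 0   # tool hops across the model boundary
    completions = 0   # full generations started
    while True:
        reply = fake_llm(messages)   # one full completion per hop
        completions += 1
        if "tool_call" not in reply:
            return reply["text"], round_trips, completions
        # stop generation -> run tool externally -> start a new completion
        round_trips += 1
        messages.append({"role": "tool",
                         "content": run_tool(reply["tool_call"])})

text, trips, calls = drive("question needing 3 lookups")
print(text, trips, calls)  # 3 tool calls => 3 round-trips, 4 completions
```

Note that the whole (growing) message list is re-sent on every hop, which is the "re-tokenize everything" cost in the comparison above.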
MOOLLM Skills and agents can run at the Speed of Light. Once loaded into context, skills iterate, recurse, compose, and simulate multiple agents — all within a single generation. No stopping. No serialization.
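By contrast, the speed-of-light pattern can be mimicked with a toy single-generation loop. This is only a simulation of what a real model would do internally within one completion; the agent names and the turn logic are made up for illustration, and the 33-turn count echoes the README's example.

```python
# Toy model of the speed-of-light pattern: several "agent" voices
# take turns inside ONE generation, with zero external round-trips.
# Nothing crosses the model boundary until the final result comes back.

def single_generation(prompt, agents, turns):
    transcript = [prompt]
    for i in range(turns):
        agent = agents[i % len(agents)]  # voices alternate in-context
        transcript.append(f"{agent}: refines turn {i}")
    return "\n".join(transcript)         # one call, one serialized result

out = single_generation("goal: draft a plan",
                        agents=["Planner", "Critic", "Scribe"],
                        turns=33)
round_trips = 1  # 33 agent turns, still a single call
print(out.count("refines"), round_trips)  # prints: 33 1
```

The contrast with the carrier-pigeon loop is the cost model: turns here are just more tokens in one generation, not new completions over a re-sent context.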
Maybe, but that's what I thought while reading the "what actually is async?" part of the post, so I don't think I was biased toward that answer by the time I got there.