Not to be all captain hindsight, but I was puzzled as I was skimming the post, as this seemed obvious to me:
Something is async when it takes longer than you're willing to wait without going off to do something else.
IMO feels sorta like Simon Willison's definition of agents. "LLMs in a loop with a goal" feels super obvious, but not sure if I would have described it that way in hindsight
that's the user-facing definition but the implementation distinction matters more.
"takes longer than you're willing to wait" describes the UX, not the architecture. the engineering question is: does the system actually free up the caller's compute/context to do other work, or is it just hiding a spinner?
nost agent frameworks i've worked with are the latter - the orchestrator is still holding the full conversation context in memory, burning tokens on keep-alive, and can't actually multiplex. real async means the agent's state gets serialized, the caller reclaims its resources, and resumption happens via event - same as the difference between setTimeout with a polling loop vs. actual async/await with an event loop.