We seem to be on a cycle of complexity -> simplicity -> complexity with AI agent design. First we had agents like Manus or Devin that had massive scaffolding around them, then we had simple LLMs in loops, then MCP added capabilities at the cost of context consumption, then in the last month everything has been bash + filesystem, and now we're back to creating more complex tools.
I wonder if there will be another round of simplifications as models continue to improve, or if the scaffolding is here to stay.
Most of the time people sit on complex because they don't have a strong enough incentive to move from something that appears/happen to work, with AI, cost would be a huge incentive.
Hmm the Gemini API doesn’t need MCP for tool-use if I understand correctly. It just needs registered functions
This is what I've been talking about for a few months now. the AI field seems to reinvent the wheel every few months. And because most people really don't know what they're talking about, they just jump on the hype and adopt the new so-called standards without really thinking if it's the right approach. It really annoys me because I have been following some open source projects that have had some genuinely novel ideas about AI agent design. And they are mostly ignored by the community. But as soon as a large company like Anthropic or OpenAI starts a trend, suddenly everyone adopts it.
It's because attention dilution stymies everything. A new chat window in the web app is the smartest the model is ever going to be. Everything you prompt into its context, without sophisticated memory management* makes it dumber. Those big context frameworks are like giving the model a concussion before it does the first task.
*which also pollutes the attention btw; saying "forget about this" doesn't make the model forget about it - it just remembers to forget about it.