Those issues are considered artifacts of the current crop of LLMs in academic circles; there is already research on letting LLMs use millions of different tools at once, and on stable long contexts. That would likely reduce the number of agents to one for most use cases outside of interfacing with different providers.
Anyone basing their future agentic systems on current LLMs would likely face LangChain's fate: built for GPT-3, made obsolete by GPT-3.5.
How would "a million different tool calls at the same time" work? For instance, MCP is HTTP based, even at low latency in incredibly parallel environments that would take forever.
yes, but those aren’t released and even then you’ll always need glue code.
you just need to be deliberate about what glue code is needed, and build it in a way that can scale with whatever new limits upgraded models give you.
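as a minimal sketch of what i mean (all names and limits here are hypothetical, not any particular framework's API): keep the model's limits in one config object instead of hardcoding them into the glue, so a better model only changes a parameter.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ModelLimits:
    # Hypothetical limits for the current model; bump these when an
    # upgraded model raises them, without touching the glue logic.
    max_tools: int = 128
    max_context_tokens: int = 200_000

def select_tools(all_tools: dict[str, Callable], relevant: list[str],
                 limits: ModelLimits) -> dict[str, Callable]:
    """Glue code: expose only as many tools as the current model allows."""
    chosen = [name for name in relevant if name in all_tools][: limits.max_tools]
    return {name: all_tools[name] for name in chosen}
```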
i can’t imagine a world where people aren’t building products that try to overcome the limitations of SOTA models
> already research allowing LLMs to use millions of different tools
Hmm, first time hearing about this. Could you share any examples, please?
Can you link to the research on millions of different tools and stable long contexts? I haven't come across that yet.