Hacker News

nextaccountic · today at 7:21 AM

> Context Bloat: Using a skill often requires loading the entire SKILL.md into the LLM’s context window, rather than just exposing the single tool signature it needs. It’s like forcing someone to read the entire car’s owner’s manual when all they want to do is call car.turn_on().

MCP has severe context bloat just from starting a thread. If harnesses were smart enough to summarize the tools a server provides at install time (rather than dumping the whole thing into context), things would be better. But a worse problem is that MCP output goes straight into the agent's context, rather than being piped somewhere else.

A solution is to have the agent run a CLI tool to access MCP services. That way the agent can filter the output with jq, store it in a file for later analysis, etc.
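A minimal sketch of that pattern (everything here is hypothetical: `invoke` stands in for whatever your MCP client SDK's actual tool-call function looks like):

```python
import json

def run_mcp_tool(invoke, out_path=None):
    """Run an MCP tool call while keeping its output out of the agent's context.

    `invoke` is a placeholder for a real MCP client call. With out_path set,
    the result is written to disk and only the path is returned; otherwise
    the result is returned as a JSON string the agent can pipe through jq.
    """
    result = invoke()
    payload = json.dumps(result, indent=2)
    if out_path is not None:
        with open(out_path, "w") as f:
            f.write(payload)
        return out_path  # the agent sees just the path, not the whole blob
    return payload
```

Wrapped in an executable, the agent could then do something like `mcp-cli call myserver search --query foo | jq '.items[0].id'` (`mcp-cli` being a made-up name for such a wrapper) and only the filtered value ever enters its context.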


Replies

gum_wobble · today at 7:28 AM

> A solution is to have the agent run a CLI tool to access MCP services.

lol and why do you need MCP for that, why can't it just be a classic HTTP request then?

mathis-l · today at 7:36 AM

At least when working with local MCP servers, I solved this problem by wrapping the MCP tools in an in-memory cache/store. Each tool output gets stored under a unique id, and the id is returned along with the tool output. The agent can then invoke other tools by passing the id instead of regenerating all the input. Adding attribute access made this pretty powerful (e.g. pass the content under tool_return_xyz.some.data to tool A as parameter b). This saves token costs and is a lot faster. Granted, it only works for passing values between tools, but I could imagine an additional tool that pipes stuff into the storage layer would solve that.
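A rough sketch of how such a store could look (class name and id format are my invention, not the commenter's actual code):

```python
import uuid

class ToolResultStore:
    """In-memory store for tool outputs, addressed by opaque ids.

    put() stashes a tool result and returns an id like tool_return_ab12cd34;
    resolve() accepts either a bare id or a dotted path into the stored
    value (e.g. tool_return_ab12cd34.some.data), so an agent can forward a
    nested field to another tool without regenerating it.
    """

    def __init__(self):
        self._store = {}

    def put(self, value):
        ref = f"tool_return_{uuid.uuid4().hex[:8]}"
        self._store[ref] = value
        return ref

    def resolve(self, ref):
        key, *path = ref.split(".")
        value = self._store[key]
        for part in path:
            # Support dict keys and plain attribute access
            value = value[part] if isinstance(value, dict) else getattr(value, part)
        return value
```

The wrapper around each MCP tool would then call `put()` on every output before it reaches the model, and `resolve()` on any incoming argument that looks like a `tool_return_*` reference.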