Show HN: Mcp2cli – One CLI for every API, 96-99% fewer tokens than native MCP

84 points • by knowsuchagency • today at 5:18 AM • 55 comments

Every MCP server injects its full tool schemas into context on every turn — 30 tools costs ~3,600 tokens/turn whether the model uses them or not. Over 25 turns with 120 tools, that's 362,000 tokens just for schemas.

mcp2cli turns any MCP server or OpenAPI spec into a CLI at runtime. The LLM discovers tools on demand:

    mcp2cli --mcp https://mcp.example.com/sse --list             # ~16 tokens/tool
    mcp2cli --mcp https://mcp.example.com/sse create-task --help  # ~120 tokens, once
    mcp2cli --mcp https://mcp.example.com/sse create-task --title "Fix bug"

No codegen, no rebuild when the server changes. Works with any LLM — it's just a CLI the model shells out to. Also handles OpenAPI specs (JSON/YAML, local or remote) with the same interface.
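A rough sketch of how an agent harness might assemble the three calls shown above. It uses only the flags that appear in the post (`--mcp`, `--list`, `--help`, and a per-tool option such as `--title`). The helper names and the rule that each keyword argument maps to `--name value` are assumptions for illustration, not part of mcp2cli.

```python
import shlex
import subprocess

MCP2CLI = "mcp2cli"  # assumption: the binary is on PATH


def list_cmd(server_url):
    """argv for the discovery step (lists the available tools)."""
    return [MCP2CLI, "--mcp", server_url, "--list"]


def help_cmd(server_url, tool):
    """argv for fetching one tool's help text."""
    return [MCP2CLI, "--mcp", server_url, tool, "--help"]


def call_cmd(server_url, tool, **params):
    """argv for invoking a tool.

    Assumption: every parameter is passed as `--name value`,
    mirroring the `--title "Fix bug"` example in the post.
    """
    argv = [MCP2CLI, "--mcp", server_url, tool]
    for name, value in params.items():
        argv += ["--" + name, str(value)]
    return argv


def run(argv):
    """Run a command and return its stdout (not called in this sketch)."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    url = "https://mcp.example.com/sse"
    for argv in (list_cmd(url),
                 help_cmd(url, "create-task"),
                 call_cmd(url, "create-task", title="Fix bug")):
        print(shlex.join(argv))
```

Passing an argv list to `subprocess.run` avoids shell quoting problems when a model-supplied value contains spaces or quotes.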

Token savings are real, measured with cl100k_base: 96% for 30 tools over 15 turns, 99% for 120 tools over 25 turns.
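A back-of-the-envelope model of the arithmetic, using only the per-tool figures quoted in the post (~120 schema tokens per tool per turn, ~16 tokens per tool for `--list`, ~120 tokens for one `--help`). The number of distinct tools the model actually uses is an assumption, and the model ignores system prompt and tool-call output tokens, so it does not attempt to reproduce the measured 96% and 99% figures.

```python
SCHEMA_TOKENS_PER_TOOL = 120  # from the post: 30 tools cost ~3,600 tokens/turn
LIST_TOKENS_PER_TOOL = 16     # from the post: --list costs ~16 tokens/tool
HELP_TOKENS_PER_TOOL = 120    # from the post: --help costs ~120 tokens, once


def native_tokens(tools, turns):
    """Schema tokens when every tool schema is injected on every turn."""
    return tools * SCHEMA_TOKENS_PER_TOOL * turns


def cli_tokens(tools, distinct_tools_used, list_calls=1):
    """Discovery tokens: --list output plus one --help per distinct tool used."""
    listing = list_calls * tools * LIST_TOKENS_PER_TOOL
    help_text = distinct_tools_used * HELP_TOKENS_PER_TOOL
    return listing + help_text


def savings(tools, turns, distinct_tools_used):
    """Fraction of schema tokens saved relative to native injection."""
    native = native_tokens(tools, turns)
    return 1 - cli_tokens(tools, distinct_tools_used) / native


if __name__ == "__main__":
    # distinct_tools_used=5 is a hypothetical workload, not a figure from the post
    for tools, turns in ((30, 15), (120, 25)):
        print(tools, turns, native_tokens(tools, turns),
              cli_tokens(tools, 5), round(savings(tools, turns, 5), 3))
```

With the rounded 120 tokens/tool figure, 120 tools over 25 turns comes to 360,000 schema tokens, close to the 362,000 quoted in the post.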

It also ships as an installable skill for AI coding agents (Claude Code, Cursor, Codex): `npx skills add knowsuchagency/mcp2cli --skill mcp2cli`

Inspired by Kagan Yilmaz's CLI vs MCP analysis and CLIHub.

https://github.com/knowsuchagency/mcp2cli

Comments

jancurn • today at 8:53 AM

Cool, adding this to my list of MCP CLIs:

  - https://github.com/apify/mcpc
  - https://github.com/chrishayuk/mcp-cli
  - https://github.com/wong2/mcp-cli
  - https://github.com/f/mcptools
  - https://github.com/adhikasp/mcp-client-cli
  - https://github.com/thellimist/clihub
  - https://github.com/EstebanForge/mcp-cli-ent
  - https://github.com/knowsuchagency/mcp2cli
  - https://github.com/philschmid/mcp-cli
  - https://github.com/steipete/mcporter
  - https://github.com/mattzcarey/cloudflare-mcp
  - https://github.com/assimelha/cmcp

vicchenai • today at 11:10 AM

the token math is compelling but I'm curious about the discovery step. with native MCP the host already knows what tools exist. with this, the agent has to run --list first, which means extra roundtrips. for 120 tools that might still be a net win, but the latency tradeoff seems worth calling out

Doublon • today at 7:54 AM

We had `curl`, HTTP and OpenAPI specs, but we created MCP. Now we're wrapping MCP into CLIs...

devrimozcay • today at 10:37 AM

This looks useful.

One pattern we've been seeing internally is that once teams standardize API interactions through a single interface (or agent layer), debugging becomes both easier and harder.

Easier because there's a central abstraction, harder because failures become more opaque.

In production incidents we often end up tracing through multiple abstraction layers before finding the real root cause.

Curious if you've built anything into the CLI to help with observability or tracing when something fails.

stephantul • today at 7:15 AM

Tokens saved should not be your north star metric. You should be able to show that tool call performance is maintained while consuming fewer tokens. I have no idea whether that is the case here.

As an aside: this is a cool idea but the prose in the readme and the above post seems to be fully generated, so who knows whether it is actually true.

kristopolous • today at 11:04 AM

Cool to see this!

I started a similar project in January but nobody seemed interested in it at the time.

Looks like I'll get back on that.

https://github.com/day50-dev/infinite-mcp

Essentially

(1) start with the aggregator mcp repos: https://github.com/day50-dev/infinite-mcp/blob/main/gh-scrap... . pull all of them down.

(2) get the meta information to understand how fresh, maintained, and popular the projects are (https://github.com/day50-dev/infinite-mcp/blob/main/gh-get-m...)

(3) try to extract one-shot ways of loading it (npx/uvx etc) https://github.com/day50-dev/infinite-mcp/blob/main/gh-one-l...

(4) insert it into what I thought was qdrant but apparently I was still using chroma - I'll change that soon

(5) use a search endpoint and an mcp to search that https://github.com/day50-dev/infinite-mcp/blob/main/infinite...

The intention is to get this working better and then provide it as a free api and also post the entire qdrant database (or whatever is eventually used) for off-line use.

This will pair with something called a "credential file" which will be a [key, repo] pair. There's an attack vector if you don't pair them up. (You could have an mcp server for some niche thing, get on the aggregators, get fake stars, change the code to a fraudulent version of a popular mcp server, harvest real api keys from sloppy tooling and MitM)

Anyway, we're talking about 1000s of documents at the most, maybe 10,000. So it can easily be given away for free.

If you like this project, please tell me. Your encouragement means a lot to me!

I don't want to spend my time on things that nobody seems to be interested in.

benvan • today at 8:08 AM

Nice project! I've been working on something very similar here https://github.com/max-hq/max

It works by schematising the upstream and making data locally synchronised + a common query language, so the longer term goals are more about avoiding API limits / escaping the confines of the MCP query feature set - i.e. token savings on reading data itself (in many cases, savings can be upwards of thousands of times fewer tokens)

Looking forward to trying this out!

DieErde • today at 7:44 AM

Why is the concept of "MCP" needed at all? Wouldn't a single tool - web access - be enough? Then you can prompt:

    Tell me the hottest day in Paris in the
    coming 7 days. You can find useful tools
    at www.weatherforadventurers.com/tools

And then the tools url can simply return a list of urls in plain text like

    /tool/forecast?city=berlin&day=2026-03-09 (Returns highest temp and rain probability for the given day in the given city)

Which return the data in plain text.
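A minimal sketch of how a client could parse one line of such a plain-text tool listing, assuming every line has the shape `URL (description)` as in the example above. The function name and the returned dictionary layout are hypothetical.

```python
import re
from urllib.parse import parse_qsl, urlsplit

# Assumed line shape: "<url> (<description>)"
LINE_RE = re.compile(r"^(?P<url>\S+)\s+\((?P<description>.*)\)$")


def parse_tool_line(line):
    """Split one listing line into path, example parameters and description."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unrecognised tool line: " + repr(line))
    parts = urlsplit(match.group("url"))
    return {
        "path": parts.path,
        "params": dict(parse_qsl(parts.query)),
        "description": match.group("description"),
    }


if __name__ == "__main__":
    line = ("/tool/forecast?city=berlin&day=2026-03-09 "
            "(Returns highest temp and rain probability "
            "for the given day in the given city)")
    print(parse_tool_line(line))
```

Note what the listing does not carry: parameter types, which parameters are required, and authentication. Those are part of what a tool schema adds.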

What additional benefits does MCP bring to the table?

nwyin • today at 7:14 AM

cool!

anthropic mentions MCPs eating up context and solutions here: https://www.anthropic.com/engineering/code-execution-with-mc...

I built one specifically for Cognition's DeepWiki (https://crates.io/crates/dw2md) -- but it's rather narrow. Something more general like this clearly has more utility.

acchow • today at 9:50 AM

> Every MCP server injects its full tool schemas into context on every turn

I consider this a bug. I'm sure the chat clients will fix this soon enough.

Something like: on each turn, a subagent searches available MCP tools for anything relevant. Usually, nothing helpful will be found and the regular chat continues without any MCP context added.
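A toy sketch of that per-turn filter. It swaps the subagent for plain keyword overlap between the user message and each tool's name and description, which is far cruder than an LLM-based search. The tool catalogue, the stopword list and the function names are all hypothetical.

```python
import re

# Small hand-picked stopword list so that words like "a" do not count as a match
STOPWORDS = {"a", "an", "the", "for", "with", "of", "to", "in", "is", "what"}


def tokenize(text):
    """Lowercase alphanumeric words, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def relevant_tools(message, tools, min_overlap=1):
    """Return tool names whose name/description share words with the message.

    `tools` maps tool name -> description. Results are ordered by overlap,
    highest first; an empty list means no tool context is added this turn.
    """
    words = tokenize(message)
    scored = []
    for name, description in tools.items():
        overlap = len(words & tokenize(name + " " + description))
        if overlap >= min_overlap:
            scored.append((overlap, name))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]


if __name__ == "__main__":
    catalogue = {
        "create-task": "Create a new task with a title",
        "get-forecast": "Return the weather forecast for a city",
    }
    print(relevant_tools("Create a task titled Fix bug", catalogue))
    print(relevant_tools("What is 2 + 2?", catalogue))
```

Only the schemas of the returned tools would be injected for that turn. Most turns would return an empty list and pay nothing.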

tern • today at 8:43 AM

There are a handful of these. I've been using this one: https://github.com/smart-mcp-proxy/mcpproxy-go

jofzar • today at 8:11 AM

How is this the 5th one of these I have seen this week, is everyone just trying to make the same thing?

Intermernet • today at 8:57 AM

I may be showing my ignorance here, but wouldn't the ideal situation be for the service to use the same number of tokens no matter what client sent the query?

If the service is using more tokens to produce the same output from the same query, but over a different protocol, then the service is a scam.

ejoubaud • today at 8:33 AM

How does this differ from mcporter? https://github.com/steipete/mcporter/

philipp-gayret • today at 7:24 AM

Someone had to do it. mcp in bash would make them composable, which I think is the strongest benefit for high capability agents like Claude, Cursor and the like, who can write Bash better than I. Haven't gotten into MCP since early release because of the issues you named. Nice work!

silverwind • today at 7:42 AM

How exactly would the LLM discover such unknown CLI commands?

rakamotog • today at 10:22 AM

For a typical B2B SaaS use case (non-technical employees), MCP is working great since it allows people to work in chat interfaces (ChatGPT, Claude). They will not move to terminal UXs anytime soon.

So I don't see why a typical productivity app would build a CLI rather than an MCP server. Am I missing anything?

jkisiel • today at 8:01 AM

How is it different from 'mcporter', already included in eg. openclaw?

Ozzie_osman • today at 7:50 AM

I kind of feel like it might be better to go from CLI to MCP.

ekianjo • today at 10:17 AM

Doubtful that a 16-token summary is the same as the JSON tool description that uses 10x more tokens. The JSON will describe parameters in a longer way, and that probably has some positive impact on accuracy.

tuananh • today at 7:53 AM

MCP just needs to add dynamic tool discovery and lazy-load the tools; that would solve this token problem, right?

rvz • today at 7:51 AM

MCP itself is a flawed standard to begin with, as I said before [0], and it wraps around an API from the start.

You might as well directly create a CLI tool that works with the AI agents which does an API call to the service anyway.

[0] https://news.ycombinator.com/item?id=44479406

techpulse_x • today at 8:30 AM

[dead]

yogin16 • today at 9:14 AM

[dead]

liminal-dev • today at 7:31 AM

This post and the project README are obviously generated slop, which personally makes me completely skip the project altogether, even if it works.

If you want humans to spend time reading your prose, then spend time actually writing it.
