HeadlinesBriefing.com

mcp2cli Slashes LLM Token Costs with On-Demand Tool Discovery

Hacker News

mcp2cli transforms how LLMs interact with tools by converting OpenAPI specs and MCP servers into runtime CLIs, eliminating upfront schema injection. Instead of bloating system prompts with 3,600+ tokens per turn for 30 tools, it lets models discover commands via lightweight `--list` (16 tokens/tool) and `--help` (80-200 tokens) interactions. For 120 tools over 25 turns, this reduces context costs by 99%, saving ~362,000 tokens compared to traditional methods.
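The savings above are straightforward arithmetic. A minimal sketch, using the article's per-tool token estimates (the number and cost of `--help` calls are assumptions for illustration):

```python
# Back-of-envelope sketch of the token math described above.
# Per-tool figures come from the article; help-call counts are assumed.

TOKENS_PER_SCHEMA = 121   # full JSON schema injected upfront (~121 tokens/tool)
TOKENS_PER_LIST = 16      # mcp2cli `--list` summary line per tool

def upfront_cost(tools: int, turns: int) -> int:
    """Traditional approach: every tool schema rides along on every turn."""
    return tools * TOKENS_PER_SCHEMA * turns

def on_demand_cost(tools: int, help_calls: int = 10, help_tokens: int = 140) -> int:
    """On-demand approach: one --list pass, plus --help only for tools used."""
    return tools * TOKENS_PER_LIST + help_calls * help_tokens

trad = upfront_cost(120, 25)   # 363,000 tokens over the session
lean = on_demand_cost(120)     # ~3,300 tokens
print(f"saved ~{trad - lean:,} tokens ({(trad - lean) / trad:.0%})")
```

With 120 tools over 25 turns this lands at roughly a 99% reduction, matching the article's ~362,000-token figure; the exact savings depend on how many `--help` lookups the model actually makes.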

Unlike Anthropic's Tool Search, which injects full JSON schemas (~121 tokens/tool) when queried, mcp2cli's CLI interface returns human-readable summaries, cutting discovery costs to ~16 tokens per tool. It supports both MCP servers (via HTTP/SSE) and OpenAPI specs (local or remote JSON/YAML), enabling integration with any LLM. Cached specs with TTL control avoid repeated fetches, while `--toon` encoding trims output tokens by 40-60%.

The tool requires no codegen or recompilation: point it at a spec URL and the CLI exists instantly. Updates to servers or APIs appear on the next invocation. Installation as an AI agent skill (`npx skills add knowsuchagency/mcp2cli --skill mcp2cli`) lets Claude Code, Cursor, and Codex leverage this efficiency. Benchmarks using the cl100k_base tokenizer show 96% savings for 30 tools over 15 turns.

By avoiding schema bloat and enabling precise tool discovery, mcp2cli addresses a critical bottleneck in LLM tooling. It turns CLIHub's insights into a production-ready solution, proving that on-demand access—not upfront loading—is key to scalable agent architectures. For developers, this means leaner prompts, lower costs, and more responsive AI systems.