Too Many MCP Tools Make Agents Worse - Here’s How I Fixed It

#agents #ai #llm #mcp

The Model Context Protocol made agent integrations much easier. Instead of writing custom glue code for every system, you can expose tools through MCP and let the agent use them directly. That part is great.

The trouble starts when you scale it.

As your workflows grow, you add more MCP servers, and more servers mean more tools. Eventually your agent ends up staring at a giant wall of tool definitions and schemas, even though it may need only three or four of them for the task at hand.

That is not just a token-cost problem. It is a reasoning problem.

Why too many tools hurt

By the time the model sees your actual task, it has already had to process system instructions, task instructions, previous tool outputs, tool descriptions, and tool schemas.

So even before it gets to the real job, it is already wading through a lot of unrelated noise.
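To make that overhead concrete, here is a rough sketch that estimates how many tokens a tool list costs compared with the handful of tools a task actually needs. The tool definitions are made up for illustration (they only mimic the name, description, and inputSchema shape of a tools/list entry), and the 4-characters-per-token rule is a crude assumption, not a real tokenizer.

```python
import json


def estimate_tokens(text: str) -> int:
    """Very rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def tool_list_overhead(tools: list[dict]) -> int:
    """Estimate the tokens spent on serialized tool definitions."""
    return sum(estimate_tokens(json.dumps(tool)) for tool in tools)


# Hypothetical tool definitions, shaped like entries in a tools/list result.
tools = [
    {
        "name": f"tool_{i}",
        "description": "Does one specific thing for one specific system. " * 3,
        "inputSchema": {
            "type": "object",
            "properties": {
                "path": {"type": "string"},
                "limit": {"type": "integer"},
            },
            "required": ["path"],
        },
    }
    for i in range(100)
]

needed = tools[:4]  # pretend the task only needs four of them
total = tool_list_overhead(tools)
useful = tool_list_overhead(needed)
print(f"all tools: ~{total} tokens, needed tools: ~{useful} tokens")
```

The exact numbers do not matter much. The point is the ratio: most of what the model reads before the task is definitions it will never call.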

That shows up in ways most people working with agents have probably seen already. Response times go up. Tool choice gets worse. The model starts exploring the wrong paths. Sometimes it loses context halfway through and confidently reports a half-done task as finished.

The model is not always failing because it is weak. Sometimes we are just making it reason over too much junk.

GitHub ran into this too

GitHub described a similar issue in Copilot. Their solution was not to keep dumping every possible tool into the prompt.

Instead, they improved tool selection by grouping tools into clusters and using embeddings to pre-select the most relevant ones, so the model does not have to reason over hundreds of tools every time.
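As a toy illustration of pre-selection, the sketch below ranks tools by similarity to the task and keeps only the top few. It is not GitHub's implementation: a bag-of-words cosine similarity stands in for real embeddings, there is no clustering step, and the tool names and descriptions are invented.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def preselect(task: str, tools: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k tools whose descriptions best match the task."""
    task_vec = vectorize(task)
    ranked = sorted(
        tools,
        key=lambda name: cosine(task_vec, vectorize(tools[name])),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical tool catalog: name -> description.
tools = {
    "create_issue": "Create a new issue in a repository",
    "list_pull_requests": "List open pull requests in a repository",
    "send_email": "Send an email message to a recipient",
    "query_database": "Run a SQL query against the database",
}

selected = preselect("create a new issue in this repository for a login bug", tools)
print(selected)  # → ['create_issue', 'list_pull_requests']
```

Only the selected tools would then be placed in the prompt, so the model reasons over two definitions instead of the whole catalog.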

You can read their post here: How we’re making GitHub Copilot smarter with fewer tools

That is a smart solution. But while working on my own MCP gateway, I hit the problem from a slightly different angle.

Where I hit this in my own project

I have been building remote-mcp-adapter, a gateway that exposes MCP servers over HTTP.

While working on support for upload-type tools, I realized something annoying: my adapter was contributing to context bloat.

For tools that expect file uploads, the adapter can override the original tool and append staged-upload instructions so the tool can still work remotely. That preserves the original semantics, but it also makes the tool definition larger.

So for every upload-capable tool, I was effectively adding even more tokens to tools/list.
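A minimal sketch of that kind of override, to show where the extra tokens come from. The note text, the function name, and the example tool are all hypothetical; the adapter's real instructions and internals are not shown in this post.

```python
# Hypothetical wording (assumption); not the adapter's actual instructions.
UPLOAD_NOTE = (
    "To pass a file, first stage it through the gateway's upload endpoint, "
    "then supply the returned handle in place of a local path."
)


def wrap_upload_tool(tool: dict) -> dict:
    """Return a copy of the tool definition with the upload note appended."""
    wrapped = dict(tool)  # shallow copy, so the upstream definition is untouched
    wrapped["description"] = f"{tool['description'].rstrip()}\n\n{UPLOAD_NOTE}"
    return wrapped


# Hypothetical upstream tool definition.
upstream = {
    "name": "convert_pdf",
    "description": "Convert a PDF file to plain text.",
    "inputSchema": {"type": "object"},
}

wrapped = wrap_upload_tool(upstream)
added = len(wrapped["description"]) - len(upstream["description"])
print(f"description grew by {added} characters")
```

Multiply that growth by every upload-capable tool across every upstream server and the cost becomes visible.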

That left me with an awkward trade-off. I could keep the full upstream description and preserve semantics, or trim it and save tokens while risking that the model loses useful context.

My first mitigation was pretty simple. I kept only the first sentence of the upstream description and trimmed it aggressively. That helped, but it was still just damage control.
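The trimming step can be sketched roughly like this. The function name and the 120-character cap are my own illustrative choices here, not the adapter's actual code or defaults.

```python
import re


def first_sentence(description: str, max_chars: int = 120) -> str:
    """Keep only the first sentence, collapsed to one line and capped in length."""
    text = " ".join(description.split())  # collapse newlines and repeated spaces
    # A sentence ends at ., ! or ? followed by whitespace or end of text,
    # so version strings like "v1.2" do not end the sentence early.
    match = re.search(r"(.+?[.!?])(\s|$)", text)
    sentence = match.group(1) if match else text
    if len(sentence) > max_chars:
        sentence = sentence[: max_chars - 3].rstrip() + "..."
    return sentence


long_description = """Convert a PDF file to plain text.
Supports scanned documents through OCR.
Requires the path to a local file."""

print(first_sentence(long_description))  # → Convert a PDF file to plain text.
```

It saves tokens, but it also shows the risk: the details about OCR and local paths are exactly the context that gets dropped.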

The core problem remained: the client was still being shown too much too early.

The better fix: progressive tool discovery

Then I saw that FastMCP 3.1.0 added Code-Mode support, and that changed the picture.

My adapter uses FastMCP, so I could add support for it as a config toggle.

Code-Mode changes how tools are exposed to the agent. Instead of surfacing every tool definition up front, it gives the model a smaller discovery interface first. The agent can search for relevant tools, inspect schemas only when needed, and then execute the specific tool it actually wants.

That means the model no longer has to reason over everything at once.

It can narrow the search space first and pull in detail only when required.
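Here is a toy model of that search, inspect, execute flow. It is not FastMCP's Code-Mode API: the ToolCatalog class, its method names, and the example tools are all hypothetical, and the keyword match is a deliberately naive stand-in for real search.

```python
from typing import Any, Callable


class ToolCatalog:
    """Toy in-memory catalog exposing search, describe, and call operations."""

    def __init__(self) -> None:
        self._tools: dict[str, dict[str, Any]] = {}

    def register(
        self, name: str, description: str, schema: dict, fn: Callable[..., Any]
    ) -> None:
        self._tools[name] = {"description": description, "schema": schema, "fn": fn}

    def search(self, query: str, limit: int = 3) -> list[dict[str, str]]:
        """Return only names and short descriptions, never full schemas."""
        words = set(query.lower().split())
        scored = []
        for name, tool in self._tools.items():
            haystack = f"{name} {tool['description']}".lower()
            score = sum(1 for word in words if word in haystack)
            if score:
                scored.append((score, name))
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [
            {"name": name, "description": self._tools[name]["description"]}
            for _, name in scored[:limit]
        ]

    def describe(self, name: str) -> dict:
        """Return the full input schema for one tool, on demand."""
        return self._tools[name]["schema"]

    def call(self, name: str, **arguments: Any) -> Any:
        """Execute the chosen tool with keyword arguments."""
        return self._tools[name]["fn"](**arguments)


catalog = ToolCatalog()
path_schema = {"type": "object", "properties": {"path": {"type": "string"}}}
catalog.register("read_file", "Read a text file from disk", path_schema,
                 lambda path: f"contents of {path}")
catalog.register("list_files", "List files in a directory", path_schema,
                 lambda path: [f"{path}/a.txt"])
catalog.register("send_email", "Send an email message", {"type": "object"},
                 lambda to, body: f"sent to {to}")

hits = catalog.search("read file")              # step 1: narrow the search space
schema = catalog.describe(hits[0]["name"])      # step 2: pull in detail when needed
result = catalog.call(hits[0]["name"], path="notes.txt")  # step 3: execute
print(result)  # → contents of notes.txt
```

The agent only ever sees three small operations up front. Full schemas enter the context one at a time, and only for tools that survived the search.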

That is a much cleaner approach than constantly trimming descriptions and hoping the model still understands enough.

Why this matters

This is bigger than shaving a few tokens off the context window.

It means less noise, better tool selection, fewer unnecessary exploratory calls, and more breathing room when you are connecting multiple upstream MCP servers through a single gateway.

For a project like mine, that matters a lot, because the whole point is to make remote MCP setups easier to work with, not to quietly make context bloat worse.

I added this to remote-mcp-adapter

The latest release of remote-mcp-adapter, v0.2.0, includes Code-Mode support.

So now, instead of forcing clients to ingest every tool definition up front, the adapter can let agents progressively discover what they actually need.

That feels like a much more scalable direction for MCP-heavy setups.

Demo

I recorded a short demo here:

And the project is here:

remote-mcp-adapter on GitHub

I’m curious how others are handling MCP tool bloat once multiple servers enter the picture, especially when tool descriptions start getting modified or wrapped at the gateway layer.