MCP Tool Search

Problem: Your MCP servers are eating the context window before the conversation begins. Seven servers get you down to 60-90K usable tokens out of 200K. The hard tasks die before you've typed a word.

Quick Win: Claude Code now flips MCP Tool Search on for you the moment your tool definitions cross 10% of the context window. Nothing to enable. Run /context and you'll see the new headroom.

What is MCP Tool Search?

Tool definitions used to load at session start, all of them, every time. The new system swaps that out for a small search index and pulls full tool details only when Claude reaches for them.

Before MCP Tool Search:

Starting session...
Loading 73 MCP tools... [39.8k tokens]
Loading 56 agents... [9.7k tokens]
Loading system tools... [22.6k tokens]
Ready with 92k tokens remaining.

After MCP Tool Search:

Starting session...
Loading tool registry... [5k tokens]
Ready with 195k tokens available.

User: "I need to query the database"
> Auto-loading: postgres-mcp [+1.2k tokens]
> 193.8k tokens remaining

For anyone running multiple servers, the headline number is a 95% reduction in startup context spend.

How MCP Tool Search Works

The trigger is automatic. Once tool descriptions take up more than 10% of the context window, the lazy path kicks in. From there:

Registry Creation: Claude Code builds a lightweight index of tool names and descriptions
On-Demand Loading: Tools load only when Claude determines they're needed for your request
Intelligent Caching: Loaded tools stay available for the session duration
Same UX: MCP tools work exactly as before - no workflow changes required

Your prompt gets read for keywords. Only the tools that look like a fit get pulled in. Everything else stays on the bench.
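The flow above can be sketched in a few lines of Python. This is an illustrative model only, not Claude Code's actual implementation: the class names, the stop-word list, the word-overlap matching, and the token costs are all assumptions made for the example. It shows the three moving parts: a threshold check, a small index of names and descriptions, and a session cache of loaded tools.

```python
import re
from dataclasses import dataclass, field

# Hypothetical stop words so that filler like "the" never triggers a match.
STOP_WORDS = {"i", "need", "to", "the", "a", "an", "and", "for", "of"}


def keywords(text: str) -> set:
    """Lowercase the text and keep only the non-filler words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOP_WORDS


@dataclass
class ToolEntry:
    """Index record: only the name and description stay resident."""
    name: str
    description: str
    definition_tokens: int  # assumed cost of loading the full definition


@dataclass
class LazyToolRegistry:
    context_window: int = 200_000
    threshold: float = 0.10  # lazy loading applies above 10% of the window
    index: dict = field(default_factory=dict)
    loaded: set = field(default_factory=set)  # session cache

    def register(self, entry: ToolEntry) -> None:
        self.index[entry.name] = entry

    def lazy_mode(self) -> bool:
        """True when the full definitions would exceed the threshold."""
        total = sum(e.definition_tokens for e in self.index.values())
        return total > self.threshold * self.context_window

    def search(self, prompt: str) -> list:
        """Return names of tools whose description shares a keyword with the prompt."""
        wanted = keywords(prompt)
        return [n for n, e in self.index.items() if wanted & keywords(e.description)]

    def load_for(self, prompt: str) -> int:
        """Load matching tools that are not cached yet; return the tokens added."""
        added = 0
        for name in self.search(prompt):
            if name not in self.loaded:
                self.loaded.add(name)
                added += self.index[name].definition_tokens
        return added


# Hypothetical servers and token costs, for illustration only.
registry = LazyToolRegistry()
registry.register(ToolEntry("postgres-mcp", "query postgresql database tables", 1_200))
registry.register(ToolEntry("github-mcp", "manage github issues and pull requests", 25_000))

if registry.lazy_mode():
    added = registry.load_for("I need to query the database")
    print(f"Auto-loaded {sorted(registry.loaded)} (+{added} tokens)")
```

A second call with a similar prompt adds zero tokens, because the matching tool is already in the session cache. A real matcher is presumably more sophisticated than word overlap; the sketch only shows why unmatched servers cost nothing.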

For MCP Server Builders

Building your own server? The server instructions field is doing real work now. With MCP Tool Search on, this is what tells Claude when to come looking.

Treat server instructions like skill descriptions that name the capability and the trigger words:

{
  "mcpServers": {
    "my-custom-server": {
      "command": "node",
      "args": ["/path/to/server.js"],
      "serverInstructions": "Database operations for PostgreSQL including queries, schema management, and data migrations. Use for any database-related tasks."
    }
  }
}

Good server instructions should:

Describe the server's capabilities clearly
Include keywords users might use in prompts
Specify when the tools should be activated
Stay short, but cover the full surface
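One way to apply that checklist before shipping a server is a small lint function. The sketch below is a hypothetical helper, not part of any MCP tooling: the function name, the 300-character limit, and the "use for" / "use when" activation check are assumptions chosen for illustration.

```python
def review_server_instructions(
    text: str, trigger_words: list[str], max_chars: int = 300
) -> list[str]:
    """Return a list of problems found in a server instructions string."""
    problems = []
    stripped = text.strip()
    lowered = stripped.lower()

    # Capabilities must be described at all.
    if not stripped:
        problems.append("instructions are empty")

    # Stay short: the limit here is an arbitrary illustrative value.
    if len(stripped) > max_chars:
        problems.append(f"longer than {max_chars} characters")

    # Include the keywords users are likely to type in prompts.
    missing = [w for w in trigger_words if w.lower() not in lowered]
    if missing:
        problems.append("missing trigger words: " + ", ".join(missing))

    # Say when the tools should be activated.
    if "use for" not in lowered and "use when" not in lowered:
        problems.append("no activation hint such as 'Use for ...'")

    return problems


example = (
    "Database operations for PostgreSQL including queries, schema management, "
    "and data migrations. Use for any database-related tasks."
)
print(review_server_instructions(example, ["database", "postgresql", "schema"]))
# → []
```

The example string from the config above passes all four checks. A vaguer string such as "Helpful tools" would be flagged for missing trigger words and for lacking an activation hint.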

Checking Your Context Usage

Two slash commands tell you what's loaded and what isn't:

# Check current context usage
/context
 
# See which MCP tools are loaded
/mcp

You'll notice the starting token count drops sharply once lazy loading is active. As Claude pulls tools in for real work, the count climbs, but only by what you actually use.

Configuration Options

The defaults handle most setups. When you want a different behavior, the settings file and a couple of slash commands give you the levers.

Enable or Disable Tool Search

Flip the global switch from your Claude Code settings:

{
  "enable_tool_search": true
}

Set enable_tool_search to false if you prefer all MCP tools loaded at session start (legacy behavior).

Per-Server Control

Disable tool search for specific servers if you always need certain tools immediately:

/mcp disable tool-search my-always-needed-server

Force-load specific tools by asking for them in a prompt when you know you'll need them:

Load the github and postgres MCP tools for this session

Real-World Impact

The numbers come straight out of GitHub issue #7336, the bug report that started this whole feature:

Resource	Before	After
MCP tools	39.8k tokens (19.9%)	~5k tokens (2.5%)
Available context	92k tokens	195k tokens
Usable for work	46%	97.5%
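The percentages in the table follow directly from the token counts, assuming the 200K context window mentioned at the top of this article. A short check in Python, with function names made up for the example:

```python
CONTEXT_WINDOW_K = 200.0  # assumed window size, in thousands of tokens


def share_of_window(tokens_k: float) -> float:
    """Percentage of the context window that a token count occupies."""
    return round(tokens_k / CONTEXT_WINDOW_K * 100, 1)


def reduction_pct(before_k: float, after_k: float) -> float:
    """Percentage drop from before to after."""
    return round((before_k - after_k) / before_k * 100, 1)


# MCP tool definitions as a share of the window
print(share_of_window(39.8))   # → 19.9
print(share_of_window(5.0))    # → 2.5

# Context left for actual work
print(share_of_window(92.0))   # → 46.0
print(share_of_window(195.0))  # → 97.5

# Total startup overhead: 200k - 92k = 108k before, 200k - 195k = 5k after
overhead_before = CONTEXT_WINDOW_K - 92.0
overhead_after = CONTEXT_WINDOW_K - 195.0
print(reduction_pct(overhead_before, overhead_after))  # → 95.4

# MCP tool definitions alone
print(reduction_pct(39.8, 5.0))  # → 87.4
```

The roughly 95% figure describes total startup overhead (108k down to 5k). The MCP tool definitions on their own shrink by about 87%.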

Builders running dense setups, including database servers, GitHub integrations, browser automation, and custom APIs, can finally run a full workload without bumping the ceiling.

Compatibility Notes

Every existing MCP server keeps working. A few caveats are worth knowing:

Older servers: May work less efficiently if they lack good tool descriptions
Custom servers: Add clear serverInstructions for best results
High-frequency tools: Consider disabling lazy loading for servers you use constantly

What This Enables

With startup overhead cut by roughly 95%, and available context going from 92k to 195k tokens in the example above, the practical wins look like this:

Run longer, more complex coding sessions
Use more MCP servers simultaneously without penalty
Maintain conversation history across extended workflows
Execute multi-step tasks that previously hit context limits

Next Steps

Get the most out of your MCP setup:

Audit your servers: Run /context to see your current usage
Update server instructions: Add descriptive instructions to custom servers
Explore more servers: Check our popular MCP servers guide - you can now run more without penalty
Learn MCP fundamentals: Review MCP basics if you're new to the protocol

The context tax was the biggest brake on stacking MCP servers. Lazy loading turns that brake off. Wire up the servers you want, and the runtime keeps the window honest. If you'd rather skip the trial-and-error on which servers to wire together, ClaudeFast's Code Kit ships a curated MCP setup alongside its 18-agent system, so you start the first session already tuned.
