MCP Tool Search
Lazy loading for MCP tool definitions in Claude Code, gated by a 10% context-window threshold.
Problem: Your MCP servers are eating the context window before the conversation begins. Seven servers gets you down to 60-90K usable tokens out of 200K. The hard tasks die before you've typed a word.
Quick Win: Claude Code now flips MCP Tool Search on for you the moment your tool definitions cross 10% of the context window. Nothing to enable. Run /context and you'll see the new headroom.
What is MCP Tool Search?
Tool definitions used to load at session start, all of them, every time. The new system swaps that out for a small search index and pulls full tool details only when Claude reaches for them.
Before MCP Tool Search:
Starting session...
Loading 73 MCP tools... [39.8k tokens]
Loading 56 agents... [9.7k tokens]
Loading system tools... [22.6k tokens]
Ready with 92k tokens remaining.After MCP Tool Search:
Starting session...
Loading tool registry... [5k tokens]
Ready with 195k tokens available.
User: "I need to query the database"
> Auto-loading: postgres-mcp [+1.2k tokens]
> 193.8k tokens remainingFor anyone running multiple servers, the headline number is a 95% reduction in startup context spend.
How MCP Tool Search Works
The trigger is automatic. Cross the 10% mark on tool description tokens and the lazy path kicks in. From there:
- Registry Creation: Claude Code builds a lightweight index of tool names and descriptions
- On-Demand Loading: Tools load only when Claude determines they're needed for your request
- Intelligent Caching: Loaded tools stay available for the session duration
- Same UX: MCP tools work exactly as before - no workflow changes required
Your prompt gets read for keywords. Only the tools that look like a fit get pulled in. Everything else stays on the bench.
For MCP Server Builders
Building your own server? The server instructions field is doing real work now. With MCP Tool Search on, this is what tells Claude when to come looking.
Treat them like skill descriptions. They name the capability and the trigger words:
{
"mcpServers": {
"my-custom-server": {
"command": "node",
"args": ["/path/to/server.js"],
"serverInstructions": "Database operations for PostgreSQL including queries, schema management, and data migrations. Use for any database-related tasks."
}
}
}Good server instructions should:
- Describe the server's capabilities clearly
- Include keywords users might use in prompts
- Specify when the tools should be activated
- Stay short, but cover the full surface
Checking Your Context Usage
Two slash commands tell you what's loaded and what isn't:
# Check current context usage
/context
# See which MCP tools are loaded
/mcpYou'll notice the starting token count drops sharply once lazy loading is active. As Claude pulls tools in for real work, the count climbs, but only by what you actually use.
Configuration Options
The defaults handle most setups. When you want a different behavior, the settings file and a couple of slash commands give you the levers.
Enable or Disable Tool Search
Flip the global switch from your Claude Code settings:
{
"enable_tool_search": true
}Set enable_tool_search to false if you prefer all MCP tools loaded at session start (legacy behavior).
Per-Server Control
Disable for specific servers (if you always need certain tools immediately):
/mcp disable tool-search my-always-needed-server
Force load specific tools when you know you'll need them:
Load the github and postgres MCP tools for this session
Real-World Impact
The numbers come straight out of GitHub issue #7336, the bug report that started this whole feature:
| Resource | Before | After |
|---|---|---|
| MCP tools | 39.8k tokens (19.9%) | ~5k tokens (2.5%) |
| Available context | 92k tokens | 195k tokens |
| Usable for work | 46% | 97.5% |
Builders running dense setups, including database servers, GitHub integrations, browser automation, and custom APIs, can finally run a full workload without bumping the ceiling.
Compatibility Notes
Every existing MCP server keeps working. A few caveats are worth knowing:
- Older servers: May work less efficiently if they lack good tool descriptions
- Custom servers: Add clear
serverInstructionsfor best results - High-frequency tools: Consider disabling lazy loading for servers you use constantly
What This Enables
With 95% more context to play with, the practical wins look like this:
- Run longer, more complex coding sessions
- Use more MCP servers simultaneously without penalty
- Maintain conversation history across extended workflows
- Execute multi-step tasks that previously hit context limits
Next Steps
Get the most out of your MCP setup:
- Audit your servers: Run
/contextto see your current usage - Update server instructions: Add descriptive instructions to custom servers
- Explore more servers: Check our popular MCP servers guide - you can now run more without penalty
- Learn MCP fundamentals: Review MCP basics if you're new to the protocol
The context tax was the biggest brake on stacking MCP servers. Lazy loading turns that brake off. Wire up the servers you want, and the runtime keeps the window honest. If you'd rather skip the trial-and-error on which servers to wire together, ClaudeFast's Code Kit ships a curated MCP setup alongside its 18-agent system, so you start the first session already tuned.
Stop configuring. Start building.
MCP Basics
Model Context Protocol servers are external processes that expose tools, APIs, and services to Claude Code through a shared wire format.
Context7 MCP
Add the Context7 MCP server to Claude Code so every session pulls current library docs at query time rather than guessing from stale training data.