Every MCP tool definition consumes context window space, even before Claude uses it. With 10+ MCP servers and dozens of tools, definitions alone can eat 10K-25K tokens.
The Problem
10 MCP servers x 5 tools each = 50 tool definitions
50 definitions x ~200-500 tokens each = 10,000-25,000 tokens of context gone
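The arithmetic above can be checked with a short sketch. The function and parameter names are illustrative only, and the 200-500 tokens-per-definition range is the rough figure from the text, not a measurement.

```python
def definition_overhead(servers: int, tools_per_server: int, tokens_per_tool: int) -> int:
    """Estimate tokens spent on tool definitions before any tool is called."""
    return servers * tools_per_server * tokens_per_tool


# Low and high ends of the rough per-definition range quoted above.
low = definition_overhead(10, 5, 200)
high = definition_overhead(10, 5, 500)
print(f"{low:,}-{high:,} tokens")  # prints "10,000-25,000 tokens"
```

Plugging in your own server and tool counts gives a quick sense of whether deferring tools is worth the effort.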
The Solution
Claude Code supports deferred tools that load only when needed. Instead of preloading every schema, Claude initially sees only the tool names. When it needs a tool, it fetches that tool’s full schema on demand.
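The idea can be illustrated with a toy registry. This is a conceptual sketch only, not how Claude Code implements deferred tools: the class `DeferredToolRegistry`, its methods, and the example tool names and schemas are all hypothetical.

```python
from typing import Callable, Dict, List


class DeferredToolRegistry:
    """Toy registry (hypothetical): names are always visible, schemas load on first use."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], dict]] = {}
        self._schemas: Dict[str, dict] = {}

    def register(self, name: str, loader: Callable[[], dict]) -> None:
        # Store only the name and a loader; the schema itself is not built yet.
        self._loaders[name] = loader

    def visible_names(self) -> List[str]:
        # The cheap part: what is always present in context.
        return sorted(self._loaders)

    def schema(self, name: str) -> dict:
        # The expensive part: fetched once, on demand, then cached.
        if name not in self._schemas:
            self._schemas[name] = self._loaders[name]()
        return self._schemas[name]

    def loaded_names(self) -> List[str]:
        return sorted(self._schemas)


registry = DeferredToolRegistry()
# Made-up tool names and schemas, for illustration only.
registry.register("github_create_issue", lambda: {"title": "string", "body": "string"})
registry.register("postgres_run_query", lambda: {"sql": "string"})

print(registry.visible_names())  # both names are visible up front
print(registry.loaded_names())   # nothing has been loaded yet
registry.schema("postgres_run_query")
print(registry.loaded_names())   # only the requested schema is loaded
```

The point of the sketch is the cost split: the name list stays small and always present, while each full schema is paid for only when a tool is actually used.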
Best Practice
- Keep your most-used tools (3-5) loaded normally for zero-latency access
- Defer specialized tools that are only needed occasionally
- Use descriptive tool names so Claude can find the right one via search
- Cap large tool outputs to protect your context:
    export MAX_MCP_OUTPUT_TOKENS=5000
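To see why descriptive names matter for search, consider a naive keyword match over tool names. The source does not describe how Claude's search actually works, so this is only an assumption-laden sketch; `search_tool_names` and the tool names are hypothetical.

```python
from typing import List


def search_tool_names(query: str, names: List[str]) -> List[str]:
    """Return names that contain every word of the query (case-insensitive)."""
    words = query.lower().split()
    return [n for n in names if all(w in n.lower() for w in words)]


# Two descriptive names and two vague ones (all made up for illustration).
names = ["github_create_issue", "postgres_run_query", "tool1", "helper"]

print(search_tool_names("create issue", names))  # ['github_create_issue']
print(search_tool_names("query", names))         # ['postgres_run_query']
```

Vague names such as `tool1` or `helper` never match an intent-based query, which is the practical argument for naming deferred tools after what they do.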
Tip
You don’t need to name MCP tools explicitly in your prompts. Describe your intent in natural language and Claude will select the right tool and craft the call with correct parameters.