
Use Deferred Tool Loading to Save Context

When you have many MCP servers configured, deferred tools load on demand instead of having every definition preloaded, which saves context window space.

March 15, 2026

Every MCP tool definition consumes context window space, even before Claude uses it. With 10+ MCP servers and dozens of tools, definitions alone can eat 10K-25K tokens.

The Problem

10 MCP servers × 5 tools each = 50 tool definitions
50 definitions × ~200-500 tokens each = 10,000-25,000 tokens of context gone
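To run the same estimate for your own setup, the arithmetic above can be sketched in a few lines of Python. The function name `definition_overhead` is made up for illustration, and the 200-500 tokens-per-definition range is the rough figure from this tip, not a measured value.

```python
def definition_overhead(
    servers: int,
    tools_per_server: int,
    tokens_per_tool: tuple[int, int],
) -> tuple[int, int]:
    """Return the (low, high) token cost of preloading every tool definition."""
    tools = servers * tools_per_server
    low, high = tokens_per_tool
    return tools * low, tools * high


# Figures from the example above: 10 servers, 5 tools each, ~200-500 tokens per definition.
low, high = definition_overhead(10, 5, (200, 500))
print(f"{low:,}-{high:,} tokens")  # 10,000-25,000 tokens
```

Swap in your own server and tool counts to see how quickly the overhead grows.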

The Solution

Claude Code supports deferred tools that load only when needed. Instead of preloading all schemas, Claude sees only tool names. When it needs one, it fetches the full schema on demand.
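The idea can be pictured with a small sketch. This is an illustrative model only, not Claude Code's actual implementation: the `DeferredToolRegistry` class, its methods, and the tool names are all hypothetical. It assumes that names are cheap to keep in view and that a full schema is fetched once, on first use.

```python
from typing import Callable

Schema = dict  # stand-in for a tool's full definition


class DeferredToolRegistry:
    """Hypothetical model: names are always visible, schemas load on first use."""

    def __init__(self) -> None:
        self._loaders: dict[str, Callable[[], Schema]] = {}
        self._loaded: dict[str, Schema] = {}

    def register(self, name: str, loader: Callable[[], Schema]) -> None:
        # Only the name is stored up front; the loader is not called yet.
        self._loaders[name] = loader

    def search(self, query: str) -> list[str]:
        # Lookup runs over names only, which is why descriptive names help.
        return sorted(n for n in self._loaders if query.lower() in n.lower())

    def schema(self, name: str) -> Schema:
        # Fetch the full definition on first request, then reuse the cached copy.
        if name not in self._loaded:
            self._loaded[name] = self._loaders[name]()
        return self._loaded[name]

    def loaded_count(self) -> int:
        return len(self._loaded)


registry = DeferredToolRegistry()
# Hypothetical tools; each loader stands in for fetching a schema from a server.
registry.register(
    "sentry_list_errors",
    lambda: {"name": "sentry_list_errors", "parameters": {"project": "string"}},
)
registry.register(
    "figma_export_frame",
    lambda: {"name": "figma_export_frame", "parameters": {"frame_id": "string"}},
)

print(registry.loaded_count())       # 0: nothing has been fetched yet
matches = registry.search("sentry")  # search by name only
print(registry.schema(matches[0]))   # the schema is fetched here, on demand
print(registry.loaded_count())       # 1: only the tool that was needed
```

In this model, the context cost up front is the list of names; the full definition is paid for only when a tool is actually used.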

Best Practice

Keep your most-used tools (3-5) loaded normally for zero-latency access
Defer specialized tools that are only needed occasionally
Use descriptive tool names so Claude can find the right one via search
Cap large tool outputs to protect your context:

export MAX_MCP_OUTPUT_TOKENS=5000
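One way to apply the first two practices above is to rank tools by how often they are used, keep the top few loaded, and defer the rest. The sketch below assumes you already have per-tool usage counts from somewhere; the function `split_tools`, the tool names, and the counts are all hypothetical.

```python
def split_tools(usage: dict[str, int], keep: int = 5) -> tuple[list[str], list[str]]:
    """Split tools into (keep loaded, defer) by descending usage count."""
    # Sort by count, highest first; break ties alphabetically for a stable result.
    ranked = sorted(usage, key=lambda name: (-usage[name], name))
    return ranked[:keep], ranked[keep:]


# Hypothetical usage counts for illustration only.
usage = {
    "github_search_issues": 42,
    "postgres_run_query": 31,
    "sentry_list_errors": 3,
    "figma_export_frame": 1,
}

loaded, deferred = split_tools(usage, keep=2)
print("keep loaded:", loaded)
print("defer:", deferred)
```

The `keep` value corresponds to the 3-5 most-used tools suggested above; the right number depends on how many tools you reach for in a typical session.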

Tip

You don’t need to name MCP tools explicitly in your prompts. Describe your intent in natural language and Claude will select the right tool and craft the call with correct parameters.

Paste into Claude Code

Check how many MCP tools are loaded and how much context they're using. Suggest which tools to defer to save context window space.