The MCP Tool Prefix Trap: How Inherited Tools Silently Break Your Cache


Every time Claude Code spawns a subagent, it sends the full tool definitions for every MCP server your session has loaded. Every. Single. Spawn.

If you have nine MCP servers running — totally normal for anyone using Claude desktop with a few integrations — that’s 7,000+ tokens of tool definitions riding along on every agent call. A research subagent that reads files and searches the web is hauling Gmail tool definitions it will never touch.

That alone would be annoying. What makes it a genuine cost trap is where in the prompt those tool definitions land.

The cache hierarchy you’re probably not thinking about

Anthropic’s prompt caching works by exact byte-prefix matching. The incoming bytes have to match a previous request from position 0, unbroken. The first difference busts everything after it.

Here’s the ordering:

1. Tool definitions       ← changes here bust EVERYTHING below
2. System prompt          ← changes here bust system + messages
3. Message history        ← changes here bust messages only

Tools are at position 1. They are the most upstream content in the entire prefix. One new MCP tool, one removed server, one reordered definition — and the cache misses on tools, the system prompt, and the entire conversation history.

This is documented in the Anthropic prompt caching docs (as of 2026-04-14) and covered in detail in Prompt Caching Basics, but most people read past it. They focus on “keep your system prompt stable” without realizing there’s a more volatile layer sitting above it.

What you’re actually shipping per spawn

Here’s a typical Claude desktop install with a few integrations:

MCP serverExample
Claude_PreviewBuilt-in, Claude desktop
Claude_in_ChromeChrome extension
mcp-registryBuilt-in
scheduled-tasksBuilt-in
Gmail / SlackVia a Cowork integration
Accessibility checkerCustom
OpenTelemetry monitorCustom
Linear / GitHub / SupabasePick one or two

Each server registers tool definitions. Those definitions accumulate in the tool prefix for every API call your session makes — including every subagent spawn.

Before optimization (all MCP inherited): ~7,000 tokens of tool definitions per spawn

After optimization (built-ins only): ~1,500 tokens of tool definitions per spawn

Savings: ~5,500 tokens × every spawn in your pipeline

(These numbers are approximate, based on a representative session measurement with that MCP configuration. Your counts will vary — as of 2026-04-14.)

At Sonnet 4.5 rates ($3/M input, $0.30/M cached — as of 2026-04-14), a pipeline that spawns 6 agents saves roughly $0.10 per run just on tool prefix tokens. That’s before counting the cache-busting effect. If those 7,000 tokens are different enough per spawn to prevent a cache hit, you’re paying full input price on everything downstream.

The fix: one line in agent frontmatter

When you create a custom subagent at .claude/agents/your-agent.md, the default behavior is to inherit all tools from the parent session — including every MCP server loaded in your desktop app or ~/.claude/.mcp.json.

The Claude Code subagent docs (as of 2026-04-14) expose a tools: frontmatter field that overrides this. Set it explicitly and the agent gets only what you list.

Before (default — inherits everything):

---
name: research-agent
description: Research subagent for competitive analysis
model: inherit
---

After (explicit — stable, minimal prefix):

---
name: research-agent
description: Research subagent for competitive analysis
model: inherit
tools: Read, Write, Edit, Bash, Glob, Grep, Agent, WebSearch, WebFetch, TodoWrite
---

The built-in tool list for Claude Code is stable across sessions. No MCP server churn, no surprise additions when you install a new integration. The prefix stays identical spawn after spawn — which means it caches.

For agents that genuinely need an MCP server, list those specific tools explicitly rather than inheriting all:

---
name: ux-reviewer
description: Visual and accessibility review agent
model: inherit
tools: Read, Glob, Grep, TodoWrite, mcp__a11y-accessibility__check_color_contrast, mcp__a11y-accessibility__test_accessibility
---

This is also better security hygiene: a read-only reviewer has no business calling the Supabase write tools, and now it structurally can’t.

The compounding win: cross-agent cache sharing

Here’s where it gets interesting.

Two subagents with identical tools: lists share the same tool prefix. The first spawn pays to build the cache; every subsequent spawn — whether the same agent type or a different one — can hit it.

Consider a pipeline that spawns six agents:

Coach → [Research, Competitive, Content, SEO, WebDev, UX]

With inherited MCP tools:

Each agent type inherits whatever happens to be loaded in the parent session. If any agents have different mcpServers: overrides, or if server ordering differs, they each get a unique tool prefix. Six spawns, six cold starts on tools.

With explicit, shared tools: lists:

# All six agents have this same tools: line
tools: Read, Write, Edit, Bash, Glob, Grep, Agent, WebSearch, WebFetch, TodoWrite

The first spawn builds the cache on that tool prefix. Spawns 2–6 hit it. The tool prefix caches once per pipeline run instead of six times.

The fewer differences across agent definitions, the more the cache is shared. A common minimal base list with role-specific additions on top is the pattern to aim for.

One way to audit your current setup

If you want per-request token attribution across agents, OTel + Jaeger gives you that. For a quick sense of how many tools your session is loading, this one-liner covers the basics:

# Count MCP tools in your current session (run from Claude Code's bash)
claude --print "Count the total number of tools available in this session, broken down by MCP server. Output as a table." 2>/dev/null

Or check your ~/.claude/.mcp.json and desktop app settings to see which servers are registered. Each server typically registers 5–15 tool definitions.

A quick checklist before you ship agents

  • Every .claude/agents/ file has an explicit tools: line
  • The base list matches across all agents (shared prefix = shared cache)
  • Role-specific MCP tools are listed individually, not via inherited-all
  • No per-agent mcpServers: overrides unless that server is genuinely required by that agent

The caveat you need to know

There’s a known bug in Claude Code (#29966) where subagent caching doesn’t currently behave as expected. Understanding what each agent actually receives in context is covered in Skills vs Rules vs Reads — the two posts pair well. The structural optimization described here is correct — minimizing and stabilizing the tool prefix is the right move regardless — but the full cache-sharing benefit across agents may not activate until Anthropic resolves the bug.

Set this up now, get the immediate savings from smaller tool prefixes per spawn, and be positioned to benefit from full cross-agent cache sharing once the fix ships.

Also worth noting: the interaction between mcpServers: (the field for overriding which servers an agent can access) and tools: (the field for whitelisting specific tool names) is not fully documented as of 2026-04-14. If you set both fields, test the behavior in your specific setup rather than assuming one takes precedence over the other.

Over-restricting tools: is a real footgun: if an agent tries to call a tool not in its list, it fails. Start with a generous built-in list and trim only after you’ve confirmed the agent works.

The 60-second audit

If you’re running any multi-agent pipeline in Claude Code right now, open your agent files and check:

  1. Does every file have tools: in the frontmatter? If not, it inherits everything.
  2. Are the tools: lists consistent across agents? Inconsistency kills cross-agent sharing.
  3. Are you listing MCP tools individually or relying on server-level inheritance? Individual listing keeps the prefix stable when server configs change.

The fix is genuinely one line per agent file. The sessions you’ve already run are already paid for. Every session from here is cheaper.