<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[AI Engineer]]></title><description><![CDATA[Builder documenting the journey. This is where I share everything I learn along the way.
From prototypes to production. From ideas to shipped products.]]></description><link>https://ai-engineer-prod.dev</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1763009754888/afb546fa-348d-49bc-8255-3c2cbbf1fe22.png</url><title>AI Engineer</title><link>https://ai-engineer-prod.dev</link></image><generator>RSS for Node</generator><lastBuildDate>Thu, 16 Apr 2026 20:59:34 GMT</lastBuildDate><atom:link href="https://ai-engineer-prod.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[You're Using MCP Wrongly]]></title><description><![CDATA[TL;DR: Many developers overload their MCP setup by wiring every server and tool into one giant agent, causing context bloat, latency, and flaky behavior. Treating MCP like an architecture problem, pruning tools, grouping them by task, and routing wor...]]></description><link>https://ai-engineer-prod.dev/youre-using-mcp-wrongly</link><guid isPermaLink="true">https://ai-engineer-prod.dev/youre-using-mcp-wrongly</guid><category><![CDATA[mcp]]></category><category><![CDATA[#ai-tools]]></category><dc:creator><![CDATA[Darren Lee]]></dc:creator><pubDate>Wed, 19 Nov 2025 03:04:09 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763521285717/b2175846-564c-4682-a20f-eb32bdde2784.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>TL;DR</strong>: Many developers overload their MCP setup by wiring every server and tool into one giant agent, causing context bloat, latency, and flaky behavior. Treating MCP like an architecture problem, pruning tools, grouping them by task, and routing work through small, focused sub‑agents in Claude, GitHub Copilot, and similar IDEs helps keep context lean and makes AI orchestration faster, safer, and more predictable.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763452531650/5b0b072e-7efb-4e79-8bdc-7bcb9c8b8f64.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-introduction">Introduction</h2>
<p>Most developers wire up the Model Context Protocol (MCP) by dumping every tool from every MCP server into a single agent, then wonder why Claude, Cursor, or other AI IDEs feel slow, distracted, and unreliable.<br />The reality is that tool definitions alone can consume 20–30% of a large context window, and when combined with context poisoning risks, this “just connect all the tools” habit quietly destroys both performance and safety.​</p>
<p>In this guide, you’ll learn: what context bloat actually is in the MCP protocol, why “too many tools” breaks agents, how Claude subagents solve this, and how to design a sub-agent architecture that keeps your MCP setup fast, lean, and secure.​</p>
<hr />
<h2 id="heading-quick-refresher-what-mcp-actually-does">Quick Refresher: What MCP Actually Does</h2>
<p>Before debugging context bloat, it helps to restate what the Model Context Protocol is doing inside your stack.​</p>
<ul>
<li><p>MCP is a JSON-RPC–based protocol that lets AI clients like Claude Desktop or Cursor connect to external capabilities through an MCP server.​</p>
</li>
<li><p>An MCP server wraps existing APIs, services, or data (files, databases, SaaS APIs) and exposes them as tools, resources, and prompts that the model can call autonomously.​</p>
</li>
</ul>
<p>In other words, MCP is the translation layer between “LLM wants to act” and “real-world systems that actually do the work”, and that translation happens through tool definitions that live inside the model’s context.</p>
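<p>Concretely, when the model decides to act, the client sends a JSON-RPC request naming the tool and its arguments. Here is a hand-written sketch (the <code>query_database</code> tool and its arguments are made up for illustration; <code>tools/call</code> is the MCP method name):</p>
<pre><code class="lang-json">{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "query_database",
    "arguments": { "sql": "SELECT count(*) FROM users" }
  }
}
</code></pre>
<p>The MCP server executes the call and returns a result message the model can read — no custom glue code per API.</p>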
<p>For full details on MCP servers, you can check out <a target="_blank" href="https://ai-engineer-prod.dev/the-complete-guide-to-model-context-protocol-mcp">the complete guide to Model Context Protocol article</a> that we wrote earlier.</p>
<hr />
<h2 id="heading-the-context-window-reality-check">The Context Window Reality Check</h2>
<p>Modern Claude models and IDE agents advertise huge context windows, up to 200K tokens, but most engineers underestimate how quickly MCP tools can eat that up.​ Every tool your MCP server exposes adds names, descriptions, schemas, examples, and sometimes long-form instructions that the AI needs to read and reason about.​</p>
<p>If you attach a heavy MCP server (for example, one that bundles Playwright, scraping, PDF handling, and more), that one MCP server alone can consume 10% of your usable context, even before you add project files, system prompts, or user instructions.​ Stack three or four such MCP servers on a single agent and you can easily burn 40K+ tokens on tool metadata alone, leaving far less room for the actual task.​</p>
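<p>To see how quickly this compounds, here’s a back-of-envelope sketch. The 700-tokens-per-tool figure is an assumed average for illustration, not a measurement:</p>
<pre><code class="lang-python"># Rough illustration: tool metadata cost vs. a 200K-token context window.
# TOKENS_PER_TOOL is an assumed average, not a measured figure.
TOKENS_PER_TOOL = 700
CONTEXT_WINDOW = 200_000

def metadata_overhead(num_tools: int) -> float:
    """Fraction of the context window consumed by tool definitions alone."""
    return (num_tools * TOKENS_PER_TOOL) / CONTEXT_WINDOW

# Four heavy MCP servers at ~15 tools each:
print(f"{metadata_overhead(60):.0%}")  # prints 21%
</code></pre>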
<p>This is the foundation of the <strong>context bloat</strong> and <strong>context poisoning</strong> problem in MCP-based AI orchestration. ​</p>
<h3 id="heading-playwright-mcp-context-side">Playwright MCP context size</h3>
<p>As an example, let’s take a look at Playwright MCP. Claude Code offers a handy <code>/context</code> command that lets you check how much of the context window has been used. From this, you can see that the Playwright MCP alone takes up nearly <strong>8% of the total context window.</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763512946340/45951aba-8ad7-4b08-be6e-712d6df80f2d.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-why-too-many-tools-breaks-your-agent">Why “Too Many Tools” Breaks Your Agent</h2>
<p>The problem is not just “lots of tokens”. It’s how LLMs use those MCP tools when planning and acting.​</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763451845898/b85b4b59-d2ae-4524-a8e1-6b7680b62b22.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-attention-dilution">Attention Dilution</h3>
<p>Language models don’t call tools blindly. They scan the list of available tools and decide which one looks relevant.​ When you expose 50+ MCP tools, the model must mentally evaluate many irrelevant options for each decision, which adds noise and leads to more indecisive or suboptimal behaviour.​</p>
<h3 id="heading-latency-degradation">Latency Degradation</h3>
<p>Each planning step over a large tool list costs compute and time, so your responses slow down as the MCP tool catalog grows.​ In multi-step workflows such as deep research agents or multi-stage refactors, this overhead compounds, creating 2–3× slower runs compared to a focused tool set.</p>
<h3 id="heading-goal-drift">Goal Drift</h3>
<p>With too many tools visible, the agent often goes off-script: instead of doing the exact thing you asked, sometimes it tries “bonus” actions it thinks might be helpful.​ You requested a simple scrape; it decides to reformat, summarise, and cross-check with other MCP tools, burning context and time on tasks you never asked for.​</p>
<hr />
<h3 id="heading-how-do-we-solve-the-mcp-context-issue-use-sub-agents">How do we solve the MCP context issue? Use sub-agents.</h3>
<p>Right now, MCP servers can’t be exposed only to sub-agents while staying hidden from the main agent. However, we can still achieve a similar outcome by enabling the tools globally but having only the sub-agent execute and reason through them.</p>
<p>This matters because reasoning about how to use a tool also consumes tokens. Taking Playwright as an example, the main agent doesn’t need to understand how to operate the tool — it just needs to know whether the result is successful. By delegating execution to a sub-agent, all the “tool-use thinking” is contained within that sub-agent, instead of bloating the main agent’s context.</p>
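<p>The delegation pattern itself is simple. Here’s a plain-Python sketch of the control flow (the function names and return shape are hypothetical — real sub-agents are configured in your IDE, not hand-coded like this):</p>
<pre><code class="lang-python"># Sketch: tool-heavy reasoning stays inside the sub-agent; the
# orchestrator only ever sees a compact summary of the outcome.

def playwright_subagent(task: str) -> dict:
    """Stands in for a sub-agent that loads the Playwright tool schemas."""
    steps_taken = ["navigate", "fill form", "screenshot"]  # stand-in for real tool calls
    return {"ok": True, "summary": f"{task}: {len(steps_taken)} browser steps completed"}

def orchestrator(request: str) -> str:
    """The lead agent never loads Playwright's schemas into its own context."""
    result = playwright_subagent(request)
    return result["summary"] if result["ok"] else "escalate to user"

print(orchestrator("scrape pricing page"))  # prints "scrape pricing page: 3 browser steps completed"
</code></pre>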
<p>So instead of imagining an MCP setup as <em>one all-knowing agent with every tool</em>, think in terms of a small hierarchy of agents with clearly separated responsibilities:</p>
<ul>
<li><p><strong>A lead/orchestrator agent</strong><br />  Handles task planning, decomposition, and routing.</p>
</li>
<li><p><strong>Multiple specialized sub-agents</strong><br />  Each handles focused work — research, analysis, execution, validation — and only loads the tools relevant to its job.</p>
</li>
</ul>
<p>Instead of one agent being aware of 60 tools across multiple MCP servers, you might have four sub-agents, each with only 10–15 tools. This dramatically reduces the context overhead required per agent.</p>
<hr />
<h2 id="heading-example-architecture-from-monolith-to-sub-agents">Example Architecture: From Monolith to Sub-Agents</h2>
<p><strong>Lead Agent (Orchestrator)</strong></p>
<ul>
<li><p>Purpose: Understand the user request, break it into steps, and route tasks.​</p>
</li>
<li><p>Tools: Minimal set, often none beyond high-level control and access to a shared memory or queue.​</p>
</li>
</ul>
<p><strong>Research Sub-Agent</strong></p>
<ul>
<li><p>Tools: Web-scraping and browser MCP servers, file readers, basic resource loaders.​</p>
</li>
<li><p>Job: Gather data, fetch content, read documents, then pass structured results back.​</p>
</li>
</ul>
<p><strong>Playwright Sub-Agent</strong></p>
<ul>
<li><p>Tools: Headless browser automation (Playwright MCP server), DOM inspectors, interaction APIs (click, type, navigate), screenshot and PDF capture, network-log hooks.</p>
</li>
<li><p>Job: Execute deterministic browser actions in a controlled environment, then return structured results.</p>
</li>
</ul>
<p><strong>Validation Sub-Agent</strong></p>
<ul>
<li><p>Tools: Test runners, linters, checkers, assertions exposed via dedicated MCP servers.</p>
</li>
<li><p>Job: Verify correctness, run tests, enforce guardrails before results return to the user.</p>
</li>
</ul>
<p>This structure makes the model context protocol feel like a team of AI specialists rather than one overwhelmed intern staring at a wall of tools.​</p>
<hr />
<h2 id="heading-implementation-claude-codes-subagents-and-github-copilots-custom-agents">Implementation: Claude Code’s Subagents and GitHub Copilot’s Custom Agents</h2>
<p>Both Claude Code and GitHub Copilot use a similar pattern: they break work into smaller, focused agents with their own roles, configuration, and tool access.</p>
<p><strong>In Claude Code</strong>, subagents are first-class implementations of this idea: you can scope them by task or MCP—one for backend microservices, one for frontend UI, another for database migrations—each wired up to different MCP servers. Each subagent has its own context window and configuration file, keeping its working memory small and focused so its reasoning stays clean and isn’t distracted by unrelated tools.</p>
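<p>A subagent in Claude Code is defined by a Markdown file with YAML frontmatter under <code>.claude/agents/</code>. A minimal sketch (the <code>browser-runner</code> name, description, and tool list below are examples, not defaults):</p>
<pre><code class="lang-markdown">---
name: browser-runner
description: Runs Playwright browser automation tasks and reports a concise result.
tools: mcp__playwright__browser_navigate, mcp__playwright__browser_click
---

You are a browser automation specialist. Carry out the requested browser
steps, then return only a short success/failure summary — never dump raw
DOM output or tool logs back to the caller.
</code></pre>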
<p><strong>GitHub Copilot</strong>’s custom agents follow the same principle. Each one is defined by a lightweight YAML‑plus‑Markdown agent profile (for example, a <code>*.agent.md</code> file in <code>.github/agents/</code> or your user profile) that sets its role, model, and available tools, effectively acting as a dedicated guide for that persona.</p>
<p>The result in both systems is the same: small, purpose‑built agents that stay sharp, relevant, and tightly scoped to the task at hand.</p>
<hr />
<h2 id="heading-managing-mcp-servers">Managing MCP servers</h2>
<p>Due to current limitations, you still need to manually enable or disable MCP servers before handing control over to a sub-agent. As of now, there’s no built-in way to expose MCP tools only to a sub-agent while hiding them from the main agent, though I suspect this is likely a feature that will arrive in the near future.</p>
<p>The process varies depending on the platform:</p>
<ul>
<li><p><strong>Claude Code</strong> – Use the <code>/mcp</code> command to view and toggle MCP servers.</p>
</li>
<li><p><strong>GitHub Copilot</strong> – Edit the <code>mcp.json</code> file and comment out the servers you want to disable.</p>
</li>
<li><p><strong>Cursor MCP</strong> – Enable or disable MCP servers directly through the in-app UI settings.</p>
</li>
</ul>
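<p>For GitHub Copilot in VS Code, for example, the relevant fragment of <code>mcp.json</code> looks roughly like this (the <code>playwright</code> entry is illustrative; check your client’s docs for the exact schema):</p>
<pre><code class="lang-json">{
  "servers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
    // Comment out server entries here to hide their tools from the main agent.
  }
}
</code></pre>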
<p>Until tool-scoping becomes native, this manual setup ensures that only the intended sub-agent performs the tool-heavy reasoning, keeping the main agent’s context clean.</p>
<hr />
<h2 id="heading-when-you-dont-need-sub-agents">When You Don’t Need Sub-Agents</h2>
<p>Sub-agents are powerful, but they’re not mandatory for every MCP use case.​</p>
<ul>
<li><p>If you only use 5–10 tools and your workflows are simple, a single well-configured MCP agent might be enough.​</p>
</li>
<li><p>If you lack external memory (database, queue, KV store) to pass data between agents, heavy orchestration may introduce more fragility than value.</p>
</li>
<li><p>When the whole context of a task matters, a single agent often performs better, since a sub-agent only passes a summary back to the main agent.</p>
</li>
</ul>
<p>In these cases, the main win is still to <strong>prune your MCP servers and tools</strong> so that only essentials are loaded into the main context.​​</p>
<hr />
<h2 id="heading-takeaways-design-around-sub-agents-not-a-giant-tool-wall">Takeaways: Design Around Sub-Agents, Not a Giant Tool Wall</h2>
<p>The default “connect all MCP servers to one agent” pattern is convenient for demos but wasteful and risky in production.​  </p>
<p>By designing around Claude subagents, GitHub custom agents, and a clear sub-agent architecture, you reclaim valuable context, improve latency, and reduce both distraction and tool-poisoning risk.​</p>
<p>The practical next steps are simple: audit which MCP tools are actually used for which tasks, group them into a handful of purpose-built agents, and let a lean orchestrator coordinate the work.​<br />Do that, and your Model Context Protocol setup will feel less like a cluttered toolbox and more like a well-run AI engineering team that knows exactly which MCP server to call, when, and why.​</p>
]]></content:encoded></item><item><title><![CDATA[The complete guide to Model Context Protocol (MCP)]]></title><description><![CDATA[TLDR; Think of Model Context Protocol (MCP) as a “universal language” that lets different AI tools understand and work with each other. If done correctly, MCP helps AI agents to be more effective at using tools.
Introduction
You’re probably here beca...]]></description><link>https://ai-engineer-prod.dev/the-complete-guide-to-model-context-protocol-mcp</link><guid isPermaLink="true">https://ai-engineer-prod.dev/the-complete-guide-to-model-context-protocol-mcp</guid><category><![CDATA[mcp]]></category><category><![CDATA[AI]]></category><category><![CDATA[Developer]]></category><category><![CDATA[Model Context Protocol]]></category><category><![CDATA[ai-agent]]></category><dc:creator><![CDATA[Darren Lee]]></dc:creator><pubDate>Thu, 13 Nov 2025 03:38:29 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763004859865/9cbc3a49-a134-434b-af22-a13118e1a5f4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>TLDR; Think of Model Context Protocol (MCP) as a “universal language” that lets different AI tools understand and work with each other. If done correctly, MCP helps AI agents to be more effective at using tools.</p>
<h1 id="heading-introduction">Introduction</h1>
<p>You’re probably here because you’ve heard the buzz about MCP and all of a sudden, every tool you use seems to have its own MCP server. But what is MCP? How does MCP work? And like every developer would ask: <em>“How do I build my own MCP server?”</em> You’re in the right place. I’ve gone down the MCP (Model Context Protocol) rabbit hole to satisfy my curiosity. I’ve consolidated everything I’ve learned here so you don’t have to.</p>
<h2 id="heading-some-context-on-how-mcp-started-and-its-current-state"><strong>Some context on how MCP started and its current state</strong></h2>
<p>Here’s how it all started: Model Context Protocol emerged in November 2024 as Anthropic's open standard for connecting AI systems to external data and tools, solving what developers call the "N×M integration problem." <strong>Within six months, MCP achieved adoption by OpenAI, Google DeepMind, and Microsoft</strong>, amassing over 16,000 servers across the ecosystem. Yet beneath this explosive growth lies a more complex reality: 43% of implementations contain critical security vulnerabilities, enterprises face a steep learning curve, and the protocol itself remains in active evolution.</p>
<p>In this guide, I hope to cut through the hype and deliver actionable intelligence for developers building real-world MCP implementations.</p>
<h3 id="heading-before-mcp">Before MCP</h3>
<p>If you understand the problem MCP solves, its existence makes sense. Developers are used to working with APIs, where they're responsible for preparing the expected input. Take a weather API endpoint—the docs list required parameters like <code>lat</code>, <code>lon</code>, and others. Your job as a developer is to prepare your software to pass exactly what the endpoint expects. Building a weather alert app? You'd load <code>lat</code> and <code>lon</code> from user data, call the API, and return something useful.</p>
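<p>That glue code is straightforward but entirely manual — something like this hypothetical sketch (the endpoint URL and field names are made up):</p>
<pre><code class="lang-python"># Hypothetical pre-MCP integration: the developer hard-codes exactly
# what the weather endpoint expects, and repeats this per endpoint.
def build_weather_request(user: dict) -> dict:
    return {
        "url": "https://api.example.com/v1/weather/alerts",  # assumed endpoint
        "params": {
            "lat": user["lat"],   # the developer must know each exact parameter name
            "lon": user["lon"],
            "units": "metric",
        },
    }

req = build_weather_request({"lat": 1.35, "lon": 103.82})
print(req["params"])  # prints {'lat': 1.35, 'lon': 103.82, 'units': 'metric'}
</code></pre>
<p>Multiply this by every endpoint and every provider, and the maintenance burden becomes obvious.</p>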
<p>Now imagine you want AI to handle this. You'd need to spell out exactly where to get the data, how to prepare it, and the conditions for each field. You'd repeat this for every single endpoint the AI needs to access. Working with 10 different API providers? Good luck—especially when one endpoint changes and everything breaks.</p>
<h2 id="heading-why-and-how-does-mcp-works">Why and how does MCP work?</h2>
<p>MCP (Model Context Protocol) is a standardized way for APIs to tell AI how to use them—similar to how API documentation tells developers how to use them. The key innovation? It shifts responsibility to the API provider. Instead of developers spelling out every detail for the AI, the weather API itself now tells the AI how to use its endpoints. That's MCP.</p>
<h3 id="heading-a-real-life-example"><strong>A real-life example</strong></h3>
<p>Let’s take a look at <strong>Figma MCP</strong>, to really understand how MCP works.</p>
<p>Here's the API endpoint developers had to work with in this open-source Figma MCP server:</p>
<pre><code class="lang-typescript">  <span class="hljs-comment">// Taken from Figma-Context-MCP ./src/services/Figma.ts</span>

  <span class="hljs-comment">/**
   * Get raw Figma API response for specific nodes (for use with flexible extractors)
   */</span>
  <span class="hljs-keyword">async</span> getRawNode(
    fileKey: <span class="hljs-built_in">string</span>,
    nodeId: <span class="hljs-built_in">string</span>,
    depth?: <span class="hljs-built_in">number</span> | <span class="hljs-literal">null</span>,
  ): <span class="hljs-built_in">Promise</span>&lt;GetFileNodesResponse&gt; {
    <span class="hljs-keyword">const</span> endpoint = <span class="hljs-string">`/files/<span class="hljs-subst">${fileKey}</span>/nodes?ids=<span class="hljs-subst">${nodeId}</span><span class="hljs-subst">${depth ? <span class="hljs-string">`&amp;depth=<span class="hljs-subst">${depth}</span>`</span> : <span class="hljs-string">""</span>}</span>`</span>;
    Logger.log(
      <span class="hljs-string">`Retrieving raw Figma node: <span class="hljs-subst">${nodeId}</span> from <span class="hljs-subst">${fileKey}</span> (depth: <span class="hljs-subst">${depth ?? <span class="hljs-string">"default"</span>}</span>)`</span>,
    );

    <span class="hljs-keyword">const</span> response = <span class="hljs-keyword">await</span> <span class="hljs-built_in">this</span>.request&lt;GetFileNodesResponse&gt;(endpoint);
    writeLogs(<span class="hljs-string">"figma-raw.json"</span>, response);

    <span class="hljs-keyword">return</span> response;
  }
</code></pre>
<p>To make this useful, developers needed to build additional modules and components around it—just to get information about a single node.</p>
<p><strong>With MCP: Instant Integration</strong><br />Connect your AI agent to the Figma MCP server. It instantly knows what to provide to get node information. No extra setup required.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763004194706/eaef265a-a630-471b-92d7-dbbec4e684d7.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-a-list-of-mcp-servers">A list of MCP servers</h2>
<p>You can find a list of MCP servers <a target="_blank" href="https://www.pulsemcp.com/servers">here</a>, or refer to the Cursor MCP <a target="_blank" href="https://cursor.com/docs/context/mcp/directory">directory</a>. Here are a few that I personally find useful as a developer:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong>Name</strong></td><td><strong>URL</strong></td><td><strong>Why Useful (For Developers)</strong></td></tr>
</thead>
<tbody>
<tr>
<td>Figma</td><td><a target="_blank" href="https://figma.com">https://figma.com</a></td><td>Figma offers responsive design, advanced component creation, interactive prototyping, and a Dev Mode for precise handoff—making it easier for developers to collaborate with designers, inspect specs, and <strong>translate designs into production code efficiently</strong>.</td></tr>
<tr>
<td>Task-master</td><td><a target="_blank" href="https://mcpmarket.com/task-master">https://mcpmarket.com/task-master</a></td><td><strong>Task-master automates task planning</strong>, parses PRDs into actionable tasks, and integrates directly with MCP servers and <strong>AI-driven editors</strong>, allowing developers to streamline workflows, automate CI/CD setup, and focus on coding rather than on meta-work.</td></tr>
<tr>
<td>Context7</td><td><a target="_blank" href="https://upstash.com/context7">https://upstash.com/context7</a></td><td>Context7 plugs <strong>up-to-date, version-specific documentation</strong> and code examples into your workflow, reducing time spent on debugging and validation, thus accelerating development and lowering technical debt for both new and experienced developers.</td></tr>
<tr>
<td>Firecrawl</td><td><a target="_blank" href="https://firecrawl.dev">https://firecrawl.dev</a></td><td>Firecrawl enables fast, scalable web scraping and data extraction, supports <strong>turning URLs into structured LLM-ready data</strong>, automates lead enrichment, supports dynamic content and anti-bot mechanisms, and accelerates agentic AI development for research or production use.</td></tr>
</tbody>
</table>
</div><h2 id="heading-how-to-use-mcp">How to use MCP?</h2>
<p><strong>Using MCP with Cursor and Other AI Tools</strong></p>
<p>You can use MCP with chat models (ChatGPT, Claude), AI-powered IDEs (GitHub Copilot, Cursor, Claude Code), or directly in your terminal. While setup varies by provider, the process typically follows three steps: configure the MCP, get your token, and enable it.</p>
<p><strong>Setting Up Cursor MCP</strong></p>
<p>Cursor maintains an official Cursor MCP list with pre-configured servers. For any MCP in this list, integration is straightforward:</p>
<ol>
<li><p>Open Cursor settings using <code>Ctrl+Shift+P</code> (or <code>Cmd+Shift+P</code> on Mac)</p>
</li>
<li><p>Navigate to the <strong>Tools &amp; MCP</strong> tab</p>
</li>
<li><p>Click the <strong>Add</strong> button next to your desired MCP server</p>
</li>
</ol>
<p>Cursor handles the configuration automatically—no manual setup required.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762937324777/347eb452-a14c-421c-8fa7-9af413c9801f.png" alt class="image--center mx-auto" /></p>
<p>Click <code>install</code> and you’ll see this screen next:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762937513185/d0691349-8037-4024-a3e4-b1cab8aaea14.png" alt class="image--center mx-auto" /></p>
<p>If you expand the tools, you’ll see exactly what your AI agent (Cursor in this case) has access to:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1762937556196/a6122797-db8a-453c-b4c1-e9cc5eb773ad.png" alt class="image--center mx-auto" /></p>
<p>Now let’s get to the fun part!</p>
<h1 id="heading-building-your-own-mcp-with-fastmcp">Building your own MCP with FastMCP</h1>
<p>Up to this point, we’ve talked a lot about what MCP is, why it exists, and how to use it. Now let’s get our hands dirty. To start, building an MCP server involves exposing capabilities through <strong>tools</strong> (functions that perform actions), <strong>resources</strong> (data providers), and <strong>prompts</strong> (reusable templates).</p>
<h2 id="heading-what-is-fastmcp">What is FastMCP?</h2>
<p>FastMCP is a Python framework that makes it easy to build MCP servers. It has become the preferred Python framework for this task thanks to its significant developer-experience improvements.</p>
<h3 id="heading-why-use-fastmcp">Why Use FastMCP?</h3>
<p><strong>Before MCP existed</strong>, integrating AI with APIs was painful:</p>
<ul>
<li><p>Developers had to write detailed instructions for every single API endpoint</p>
</li>
<li><p>Each API required custom prompt engineering</p>
</li>
<li><p>When an API changed, everything broke</p>
</li>
<li><p>Managing 10+ different APIs? A maintenance nightmare.</p>
</li>
</ul>
<p><strong>With FastMCP and MCP</strong>, your server tells AI agents exactly how to use it—automatically.</p>
<hr />
<h2 id="heading-understanding-mcp-server-capabilities">Understanding MCP Server Capabilities</h2>
<p>FastMCP lets you expose three types of capabilities:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Capability</td><td>Decorator</td><td>Purpose</td><td>Example</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Tools</strong></td><td><code>@mcp.tool()</code></td><td>Functions that DO things</td><td>Generate fortune, send email, fetch data</td></tr>
<tr>
<td><strong>Resources</strong></td><td><code>@mcp.resource()</code></td><td>Data that can be READ</td><td>View archives, access files, read stats</td></tr>
<tr>
<td><strong>Prompts</strong></td><td><code>@mcp.prompt()</code></td><td>Templates that GUIDE AI behavior</td><td>Personality modes, task instructions</td></tr>
</tbody>
</table>
</div><p><strong>Most MCP servers primarily use Tools.</strong> Resources and Prompts are optional extras for specific use cases.</p>
<hr />
<h2 id="heading-building-your-first-fastmcp-server">Building Your First FastMCP Server</h2>
<h3 id="heading-step-0-prerequisite">Step 0: Prerequisite</h3>
<ol>
<li><p>Install <a target="_blank" href="https://docs.astral.sh/uv/getting-started/installation/">uv</a> (Python package and project manager)</p>
</li>
<li><p>Create a folder with <code>pyproject.toml</code> file</p>
</li>
</ol>
<pre><code class="lang-toml">[project]
name = "a-sample-project"
version = "1.0.0"
</code></pre>
<h3 id="heading-step-1-install-fastmcp">Step 1: Install FastMCP</h3>
<pre><code class="lang-bash">uv add fastmcp
</code></pre>
<p>After running this, you’ll see <code>fastmcp</code> added to the dependencies array in your <code>pyproject.toml</code> file.</p>
<h3 id="heading-step-2-create-your-mcp-server">Step 2: Create Your MCP Server</h3>
<p>Create a file called <code>fortune_teller.py</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> fastmcp <span class="hljs-keyword">import</span> FastMCP

mcp = FastMCP(<span class="hljs-string">"Fortune Teller 🔮"</span>)

<span class="hljs-comment"># TOOL: Performs an action</span>
<span class="hljs-meta">@mcp.tool()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict_fortune</span>(<span class="hljs-params">name: str, category: str = <span class="hljs-string">"random"</span></span>) -&gt; str:</span>
    <span class="hljs-string">"""
    Generate a new fortune prediction.

    Args:
        name: The person's name to personalize the fortune
        category: Type of fortune (career, love, money, or random)

    Returns:
        A personalized fortune message
    """</span>
    <span class="hljs-keyword">import</span> random

    fortunes = {
        <span class="hljs-string">"career"</span>: [
            <span class="hljs-string">f"<span class="hljs-subst">{name}</span>, your next bug fix will be legendary! 🐛→✨"</span>,
            <span class="hljs-string">f"<span class="hljs-subst">{name}</span>, you'll finally understand that legacy codebase! 📚"</span>,
        ],
        <span class="hljs-string">"love"</span>: [
            <span class="hljs-string">f"<span class="hljs-subst">{name}</span>, you'll meet someone special in a GitHub discussion! 💕"</span>,
            <span class="hljs-string">f"<span class="hljs-subst">{name}</span>, your code review will spark romance! ❤️"</span>,
        ],
        <span class="hljs-string">"money"</span>: [
            <span class="hljs-string">f"<span class="hljs-subst">{name}</span>, you'll find a forgotten $20 in your jacket! 💰"</span>,
            <span class="hljs-string">f"<span class="hljs-subst">{name}</span>, your side project will earn DOZENS of dollars! 💸"</span>,
        ],
        <span class="hljs-string">"random"</span>: [
            <span class="hljs-string">f"<span class="hljs-subst">{name}</span>, your code will compile on the first try! 🎯"</span>,
            <span class="hljs-string">f"<span class="hljs-subst">{name}</span>, you'll finally understand async/await! ✨"</span>,
        ]
    }

    fortune_list = fortunes.get(category, fortunes[<span class="hljs-string">"random"</span>])
    <span class="hljs-keyword">return</span> random.choice(fortune_list)

<span class="hljs-comment"># RESOURCE: Provides readable data</span>
<span class="hljs-meta">@mcp.resource("fortune://history")</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fortune_history</span>() -&gt; str:</span>
    <span class="hljs-string">"""View all past fortune predictions."""</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">"""
    📜 Fortune History Archive

    Recent fortunes:
    - Alice: Your code will compile on the first try! 🎯
    - Bob: You'll find a forgotten $20 in your jacket! 💰
    - Carol: Your next bug fix will be legendary! 🐛→✨

    Total fortunes predicted: 1,337
    Accuracy rate: 0% (but 100% entertaining!)
    """</span>

<span class="hljs-comment"># PROMPT: Guides the AI's behavior</span>
<span class="hljs-meta">@mcp.prompt()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fortune_teller_mode</span>() -&gt; str:</span>
    <span class="hljs-string">"""Activate mystical fortune teller personality."""</span>
    <span class="hljs-keyword">return</span> <span class="hljs-string">"""You are a mystical fortune teller with a quirky sense of humor.

Your role:
- Greet users warmly and ask for their name
- Offer different types of fortunes (career, love, money, random)
- Deliver fortunes with emojis and dramatic flair
- Always end with "The stars have spoken! ✨"

Be entertaining, not accurate!"""</span>

<span class="hljs-keyword">if</span> __name__ == <span class="hljs-string">"__main__"</span>:
    mcp.run(transport=<span class="hljs-string">"http"</span>, port=<span class="hljs-number">8000</span>)
</code></pre>
<h3 id="heading-step-3-understanding-your-code">Step 3: Understanding Your Code</h3>
<p><strong>Why docstrings and type hints matter:</strong></p>
<p>FastMCP uses <a target="_blank" href="https://deepwiki.com/modelcontextprotocol/python-sdk/2.4-context-and-lifespan-management">Python's introspection</a> to automatically generate a structured manifest that tells AI agents how to use your tools. Here's what happens:</p>
<pre><code class="lang-python"><span class="hljs-meta">@mcp.tool()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">predict_fortune</span>(<span class="hljs-params">name: str, category: str = <span class="hljs-string">"random"</span></span>) -&gt; str:</span>
    <span class="hljs-comment">#   ^^^^^^^^^^^^^^  ^^^^       ^^^^^^^^              ^^^</span>
    <span class="hljs-comment">#   |               |          |                     |</span>
    <span class="hljs-comment">#   Tool name       Parameter  Default value         Return type</span>

    <span class="hljs-string">"""
    Generate a new fortune prediction.
    # ↑ Becomes the tool description

    Args:
        name: The person's name to personalize the fortune
        # ↑ Becomes parameter description for 'name'

        category: Type of fortune (career, love, money, or random)
        # ↑ Becomes parameter description for 'category'
    """</span>
</code></pre>
<p><strong>FastMCP converts this into JSON Schema:</strong></p>
<pre><code class="lang-json">{
  <span class="hljs-attr">"name"</span>: <span class="hljs-string">"predict_fortune"</span>,
  <span class="hljs-attr">"description"</span>: <span class="hljs-string">"Generate a new fortune prediction."</span>,
  <span class="hljs-attr">"inputSchema"</span>: {
    <span class="hljs-attr">"type"</span>: <span class="hljs-string">"object"</span>,
    <span class="hljs-attr">"properties"</span>: {
      <span class="hljs-attr">"name"</span>: {
        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
        <span class="hljs-attr">"description"</span>: <span class="hljs-string">"The person's name to personalize the fortune"</span>
      },
      <span class="hljs-attr">"category"</span>: {
        <span class="hljs-attr">"type"</span>: <span class="hljs-string">"string"</span>,
        <span class="hljs-attr">"description"</span>: <span class="hljs-string">"Type of fortune (career, love, money, or random)"</span>,
        <span class="hljs-attr">"default"</span>: <span class="hljs-string">"random"</span>
      }
    },
    <span class="hljs-attr">"required"</span>: [<span class="hljs-string">"name"</span>]
  }
}
</code></pre>
<p>This manifest is what AI agents read to understand your tools!</p>
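<p>To make the introspection step less magical, here's a rough, simplified sketch of how a framework can derive that manifest from a plain function. This is <em>not</em> FastMCP's actual implementation (the real one also parses <code>Args:</code> sections into per-parameter descriptions, handles <code>Literal</code> types, Pydantic models, and more); it just shows the core idea of reading the signature, type hints, and docstring:</p>

```python
import inspect
from typing import get_type_hints

def predict_fortune(name: str, category: str = "random") -> str:
    """Generate a new fortune prediction."""
    return f"{name}, good luck with {category}!"

def build_input_schema(fn) -> dict:
    """Simplified sketch of introspection-based manifest generation."""
    type_names = {str: "string", int: "integer", float: "number", bool: "boolean"}
    sig = inspect.signature(fn)
    hints = get_type_hints(fn)
    properties, required = {}, []
    for pname, param in sig.parameters.items():
        # Map the Python type hint to a JSON Schema type name
        prop = {"type": type_names.get(hints.get(pname), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(pname)  # no default → required parameter
        else:
            prop["default"] = param.default
        properties[pname] = prop
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn),  # docstring → tool description
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }

schema = build_input_schema(predict_fortune)
print(schema["inputSchema"]["required"])  # ['name']
```

<p>Run this and you get a dict shaped like the JSON Schema above: <code>name</code> ends up required, <code>category</code> gets its <code>"random"</code> default. That's why stripping docstrings or type hints leaves the AI flying blind.</p>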
<h3 id="heading-step-4-run-your-fastmcp-server">Step 4: Run Your FastMCP Server</h3>
<pre><code class="lang-bash">uv run fortune_teller.py
</code></pre>
<p>You should see this:</p>
<pre><code class="lang-bash">

                             ╭──────────────────────────────────────────────────────────────────────────────╮                              
                             │                                                                              │                              
                             │                         ▄▀▀ ▄▀█ █▀▀ ▀█▀ █▀▄▀█ █▀▀ █▀█                        │                              
                             │                         █▀  █▀█ ▄▄█  █  █ ▀ █ █▄▄ █▀▀                        │                              
                             │                                                                              │                              
                             │                               FastMCP 2.13.0.2                               │                              
                             │                                                                              │                              
                             │                                                                              │                              
                             │                  🖥  Server name: Fortune Teller 🔮                           │                              
                             │                                                                              │                              
                             │                  📦 Transport:   HTTP                                        │                              
                             │                  🔗 Server URL:  http://127.0.0.1:8000/mcp                   │                              
                             │                                                                              │                              
                             │                  📚 Docs:        https://gofastmcp.com                       │                              
                             │                  🚀 Hosting:     https://fastmcp.cloud                       │                              
                             │                                                                              │                              
                             ╰──────────────────────────────────────────────────────────────────────────────╯                              


[11/13/25 10:50:35] INFO     Starting MCP server <span class="hljs-string">'Fortune Teller 🔮'</span> with transport <span class="hljs-string">'http'</span> on http://127.0.0.1:8000/mcp      server.py:2050
INFO:     Started server process [788]
INFO:     Waiting <span class="hljs-keyword">for</span> application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
</code></pre>
<p>Your server is now running at <code>http://localhost:8000/mcp</code> 🔮</p>
<hr />
<h2 id="heading-connecting-fastmcp-to-cursor-cursor-mcp-integration">Connecting FastMCP to Cursor (Cursor MCP Integration)</h2>
<p>Now let's connect your FastMCP server to Cursor so the AI can use your tools.</p>
<h3 id="heading-setting-up-cursor-mcp">Setting Up Cursor MCP</h3>
<p><strong>Method 1: Using Cursor's MCP Settings</strong></p>
<ol>
<li><p>Open Cursor and press <code>Ctrl+Shift+P</code> (or <code>Cmd+Shift+P</code> on Mac)</p>
</li>
<li><p>Navigate to <strong>Settings</strong> → <strong>Tools &amp; MCP</strong></p>
</li>
<li><p>Click <strong>New MCP Server</strong></p>
</li>
<li><p>Update the <code>mcp.json</code></p>
</li>
</ol>
<pre><code class="lang-json">{
  <span class="hljs-attr">"mcpServers"</span>: {
    <span class="hljs-attr">"fortune-teller"</span>: {
      <span class="hljs-attr">"url"</span>: <span class="hljs-string">"http://localhost:8000/mcp"</span>,
      <span class="hljs-attr">"name"</span>: <span class="hljs-string">"Fortune Teller 🔮"</span>
    }
  }
}
</code></pre>
<ol start="5">
<li>Head back to the Cursor settings and you’ll see this</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763002470221/7961e219-0f44-477c-bb5e-7df6c967e94c.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-how-cursor-mcp-uses-your-fastmcp-tools">How Cursor MCP Uses Your FastMCP Tools</h2>
<p>Let's see what happens when you chat with Cursor after connecting your FastMCP server.</p>
<h3 id="heading-example-conversation">Example Conversation</h3>
<p><strong>You:</strong> "Can you predict my fortune for my career?"</p>
<p><strong>Behind the scenes:</strong></p>
<ol>
<li><p><strong>Cursor checks available tools</strong></p>
<pre><code class="lang-python"> Cursor: <span class="hljs-string">"User wants a fortune prediction. Let me check my tools..."</span>

 Found: predict_fortune <span class="hljs-keyword">from</span> Fortune Teller 🔮
 Needs: name (required), category (optional)
</code></pre>
</li>
<li><p><strong>Cursor plans the call</strong></p>
<pre><code class="lang-python"> Cursor: <span class="hljs-string">"I'll call predict_fortune with:
 - name: 'User' (from context)
 - category: 'career' (user specified)"</span>
</code></pre>
</li>
<li><p><strong>Cursor calls your FastMCP tool</strong></p>
<pre><code class="lang-json"> {
   <span class="hljs-attr">"tool"</span>: <span class="hljs-string">"predict_fortune"</span>,
   <span class="hljs-attr">"arguments"</span>: {
     <span class="hljs-attr">"name"</span>: <span class="hljs-string">"User"</span>,
     <span class="hljs-attr">"category"</span>: <span class="hljs-string">"career"</span>
   }
 }
</code></pre>
</li>
<li><p><strong>Your FastMCP server responds</strong></p>
<pre><code class="lang-json"> {
   <span class="hljs-attr">"result"</span>: <span class="hljs-string">"User, your next bug fix will be legendary! 🐛→✨"</span>
 }
</code></pre>
</li>
<li><p><strong>Cursor formats the response</strong></p>
<pre><code class="lang-python"> Cursor: <span class="hljs-string">"🔮 User, your next bug fix will be legendary! 🐛→✨"</span>
</code></pre>
</li>
</ol>
<h3 id="heading-the-magic-self-describing-tools">The Magic: Self-Describing Tools</h3>
<p>Your FastMCP server <strong>self-describes</strong> its capabilities. Cursor doesn't need manual instructions—it automatically knows:</p>
<ul>
<li><p>What tools are available</p>
</li>
<li><p>What parameters each tool needs</p>
</li>
<li><p>What types those parameters should be</p>
</li>
<li><p>Which parameters are required vs optional</p>
</li>
<li><p>What each parameter means (from your docstrings)</p>
</li>
</ul>
<hr />
<h2 id="heading-testing-you-fastmcp-server-inside-cursor">Testing Your FastMCP Server Inside Cursor</h2>
<p>Start a new Cursor window and open a new chat. Try asking “Can you predict my fortune for my career?”</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763002674946/542d22a8-4f54-433e-8c00-4ea3d24ec19f.png" alt class="image--center mx-auto" /></p>
<p>As you can see, Cursor is aware of your MCP server and calls the right tool. Depending on your settings, Cursor may ask for permission before running it. If you accept, here’s what you’ll see.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763002867638/b80151a7-2e4c-4e19-948b-03f0a71ddfb5.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-testing-your-fastmcp-server">Testing Your FastMCP Server</h2>
<p>You can test your FastMCP server without Cursor using the FastMCP client:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> asyncio
<span class="hljs-keyword">from</span> fastmcp <span class="hljs-keyword">import</span> Client

client = Client(<span class="hljs-string">"http://localhost:8000/mcp"</span>)

<span class="hljs-comment"># Test the tool</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_tool</span>():</span>
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> client:
        result = <span class="hljs-keyword">await</span> client.call_tool(
            <span class="hljs-string">"predict_fortune"</span>, 
            {<span class="hljs-string">"name"</span>: <span class="hljs-string">"Alice"</span>, <span class="hljs-string">"category"</span>: <span class="hljs-string">"career"</span>}
        )
        print(<span class="hljs-string">f"🔮 <span class="hljs-subst">{result}</span>"</span>)

<span class="hljs-comment"># Test the resource</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_resource</span>():</span>
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> client:
        history = <span class="hljs-keyword">await</span> client.read_resource(<span class="hljs-string">"fortune://history"</span>)
        print(<span class="hljs-string">f"📜 <span class="hljs-subst">{history}</span>"</span>)

<span class="hljs-comment"># Test the prompt</span>
<span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_prompt</span>():</span>
    <span class="hljs-keyword">async</span> <span class="hljs-keyword">with</span> client:
        prompt = <span class="hljs-keyword">await</span> client.get_prompt(<span class="hljs-string">"fortune_teller_mode"</span>)
        print(<span class="hljs-string">f"🎭 <span class="hljs-subst">{prompt}</span>"</span>)

<span class="hljs-comment"># Run tests</span>
asyncio.run(test_tool())
asyncio.run(test_resource())
asyncio.run(test_prompt())
</code></pre>
<p>Run the file with <code>uv run test.py</code></p>
<p><strong>Output:</strong></p>
<pre><code class="lang-python">🔮 Alice, your next bug fix will be legendary! 🐛→✨
📜 Fortune History Archive...
🎭 You are a mystical fortune teller...
</code></pre>
<hr />
<h2 id="heading-fastmcp-best-practices-for-ai-friendly-tools">FastMCP Best Practices for AI-Friendly Tools</h2>
<p><strong>Quick Rule:</strong> If a human developer would struggle to understand your function without comments, AI will too. Make everything explicit through docstrings and type hints!</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td>✅ DO</td><td>❌ DON'T</td></tr>
</thead>
<tbody>
<tr>
<td><strong>Write Clear Docstrings</strong>. Include purpose, parameters (Args), and return value descriptions. AI reads these to understand your tool.</td><td><strong>Skip Docstrings</strong>. Without docstrings, AI has no context about what your tool does or how to use it properly.</td></tr>
<tr>
<td><strong>Use Type Hints</strong>. Always specify parameter and return types (<code>name: str</code>, <code>age: int</code>). This tells AI exactly what data types to provide.</td><td><strong>Use Vague Function Names</strong>. Avoid generic names like <code>process()</code> or <code>calc()</code>. Use descriptive names like <code>convert_markdown_to_html()</code>.</td></tr>
<tr>
<td><strong>Use Literal for Enums</strong>. For parameters with specific valid values, use <code>Literal["option1", "option2"]</code> to tell AI the exact allowed choices.</td><td><strong>Leave Parameters Untyped</strong>. Untyped parameters default to <code>Any</code>, making it unclear what the AI should provide.</td></tr>
<tr>
<td><strong>Use Descriptive Parameter Names</strong>. Name parameters clearly: <code>birth_year</code> instead of <code>x</code>, <code>email_address</code> instead of <code>data</code>.</td><td><strong>Use Single-Letter Variables</strong>. Avoid unclear names like <code>x</code>, <code>y</code>, <code>data</code>, or <code>input</code> that don't explain what they represent.</td></tr>
<tr>
<td><strong>Document Edge Cases</strong>. Explain constraints in docstrings (e.g., "Cannot divide by zero", "Date must be in ISO format").</td><td><strong>Assume AI Knows Context</strong>. Don't assume AI understands implicit rules or constraints without documentation.</td></tr>
</tbody>
</table>
</div><hr />
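<p>To make the “Use Literal for Enums” rule concrete, here’s a minimal sketch in plain Python (the <code>@mcp.tool()</code> decorator is omitted so it runs standalone). <code>Literal</code> pins down the exact allowed values, and <code>typing.get_args</code> shows how a framework can read them back out and turn them into a JSON Schema <code>enum</code>:</p>

```python
from typing import Literal, get_args

# The allowed categories, spelled out as a Literal type alias
FortuneCategory = Literal["career", "love", "money", "random"]

# In a real server this would carry @mcp.tool(); shown bare here
def predict_fortune(name: str, category: FortuneCategory = "random") -> str:
    """
    Generate a new fortune prediction.

    Args:
        name: The person's name to personalize the fortune
        category: Type of fortune; must be one of the Literal values
    """
    return f"{name}, the {category} stars smile on you! ✨"

# A framework can recover the allowed choices via introspection:
print(get_args(FortuneCategory))  # ('career', 'love', 'money', 'random')
```

<p>Compare that with an untyped <code>category</code> parameter: the AI would have no idea whether to send <code>"career"</code>, <code>"Career"</code>, or <code>"work stuff"</code>. The <code>Literal</code> makes the contract explicit.</p>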
<h1 id="heading-conclusion">Conclusion</h1>
<p>So there you have it! Model Context Protocol (MCP) in a nutshell. Instead of wrestling with endless API integrations, MCP lets your tools explain themselves to AI agents. If you've made it this far, you've got everything you need to start building. Spin up a FastMCP server, hook it into Cursor MCP, and watch the magic happen. The MCP protocol is still evolving, but with major players like OpenAI and Google backing it, this is clearly where things are headed. Don't overthink it! Start with something simple, see what MCP can do, and iterate from there. The best way to understand what MCP is all about? Build something with it. Your fortune-telling server is waiting!</p>
]]></content:encoded></item></channel></rss>