How I Finally Understood MCP — and Got It Working in Real Life


  1. Introduction: Why I Wrote This
  2. The Evolution of Tool Integration with LLMs
  3. What Is Model Context Protocol (MCP), Really?
  4. Wait, MCP feels like RAG… but is it?
    1. In an MCP-based setup
    2. In a standard RAG system
    3. Traditional RAG Implementation
    4. MCP Implementation
  5. Quick recap!
  6. Core Capabilities of an MCP Server
  7. Real-World Example: Claude Desktop + MCP (Pre-built Servers)
  8. Build Your Own: Custom MCP Server from Scratch
  9. 🎉 Congrats, You’ve Mastered MCP!
  10. References

Introduction: Why I Wrote This

I'll be honest. When I first saw the term "Model Context Protocol (MCP)," I did what most developers do when confronted with yet another new acronym: I skimmed a tutorial, saw some JSON, and quietly moved on. "Too abstract," I thought. Fast-forward to when I actually tried to integrate some custom tools with Claude Desktop — something that needed memory or access to external tools — and suddenly, MCP wasn't just relevant. It was essential.

The issue? None of the tutorials I came across felt beginner-friendly. Most jumped straight into building a custom MCP server without explaining why you'd need a server in the first place — let alone mentioning that prebuilt MCP servers already exist and work out of the box. So, I decided to learn it from the ground up.

I read everything I could, experimented with both prebuilt and custom servers, integrated them with Claude Desktop, and tested whether I could explain it to my friends — people with zero prior context. When I finally got the nod from them, I knew I could break it down for anyone, even if you've never heard of MCP until five minutes ago.

This article breaks down what MCP is, why it matters, and how it compares to other popular architectures like RAG. We'll go from "what even is this?" to spinning up your own working Claude integration — no prior MCP knowledge required. If you've ever struggled to get your AI model to feel a bit less like a goldfish, this is for you.

The Evolution of Tool Integration with LLMs

Before diving into MCP, let’s understand the progression of how we connect Large Language Models (LLMs) to external tools and data:

Image by author
  1. Standalone LLMs: Initially, models like GPT and Claude operated in isolation, relying solely on their training data. They couldn’t access real-time information or interact with external systems.
  2. Tool Binding: As LLMs advanced, developers created methods to "bind" tools directly to models. For instance, with LangChain or similar frameworks, you might do something like:
llm = ChatAnthropic()
augmented_llm = llm.bind_tools([search_tool, calculator_tool])

This works well for individual scripts but doesn't scale easily across applications, because tool binding in frameworks like LangChain is usually designed around single-session, stateless interactions. Each time you spin up a new agent or function call, you're often re-defining which tools it can access. There's no centralized way to manage tools across multiple interfaces or user contexts, as the sketch below shows.
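To make the problem concrete, here's a minimal sketch (my own illustration; the tool and model name are placeholders) of two entry points that each re-bind the same tool, with nothing keeping them in sync:

from langchain_core.tools import tool
from langchain_anthropic import ChatAnthropic

@tool
def calculator(expression: str) -> str:
    """Evaluate a simple arithmetic expression."""
    return str(eval(expression))  # demo only; never eval untrusted input

llm = ChatAnthropic(model="claude-3-5-sonnet-latest")  # example model name

# Entry point A (say, a chat script) binds its own tool list...
chat_llm = llm.bind_tools([calculator])

# ...and entry point B (say, an IDE helper) repeats the same wiring.
# Nothing keeps the two tool lists in sync across apps or user contexts.
ide_llm = llm.bind_tools([calculator])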

3. Application Integration Challenge: The real complexity arises when you want to integrate tools with AI-powered applications like IDEs (Cursor, VS Code), chat interfaces (Claude Desktop), or other productivity tools. Each application would need custom connectors for every possible tool or data source, creating a tangled web of integrations.

This is where MCP enters the picture — providing a standardized layer of abstraction for connecting AI applications to external tools and data sources.

What Is Model Context Protocol (MCP), Really?

Let’s break it down:

  • Model: The LLM at the center of your application — GPT, Claude, whatever. It's a powerful reasoning engine but limited by what it was trained on and how much context it can hold.
  • Context: The extra information your model needs to do its job — documents, search results, user preferences, recent history. Context extends the model's capabilities beyond its training set.
  • Protocol: A standardized way of communicating between components. Think of it as a common language that lets your model interact with tools and data sources in a predictable way.

Put those three together, and MCP becomes a framework that connects models to contextual information and tools through a consistent, modular, and interoperable interface.

Much like HTTP enabled the web by standardizing how browsers talk to servers, MCP standardizes how AI applications interact with external data and capabilities.


Pro tip! A simple way to visualize MCP is to think of it as tool binding for the entire AI stack, not just a single agent. That's why Anthropic describes MCP as "a USB-C port for AI applications."

Image by author, inspired by Understanding MCP From Scratch by LangChain

Wait, MCP feels like RAG… but is it?

A lot of people ask, "How is this different from RAG?" Great question.

At a glance, both MCP and RAG aim to solve the same problem: give language models access to relevant, external information. But how they do it — and how maintainable they are — differs significantly.

In an MCP-based setup

  • Your AI app (host/client) connects to an MCP document server
  • You interact with context using a standardized protocol
  • You can add new documents or tools without modifying the app
  • Everything works via the same interface, consistently
Image by author, inspired by MCP Documentation.

In a standard RAG system

  • Your app manually builds and queries a vector database
  • You often need custom embedding logic, retrievers, and loaders
  • Adding new sources means rewriting part of your app code
  • Every integration is bespoke, tightly coupled to your app logic

The key distinction is abstraction: the Protocol in Model Context Protocol is nothing but a standardized interface that defines bidirectional communication between the MCP Client/Host and MCP Servers.

Image by author, inspired by MCP Documentation.

MCP gives your app the ability to ask, "Give me information about X," without knowing how or where that info is stored or retrieved. RAG systems require your app to manage all of that.

With MCP, your application logic stays the same, even as your document sources evolve.

Let's look at some high-level code to see how these approaches differ:

Traditional RAG Implementation

In a traditional RAG implementation, your application code directly manages connections to document sources:

# Hardcoded vector store logic: the app owns the embedding model,
# the index location, and the retrieval call
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

vectorstore = FAISS.load_local("store/embeddings", OpenAIEmbeddings())
retriever = vectorstore.as_retriever()
response = retriever.invoke("query about LangGraph")

With tool binding, you define tools and bind them to an LLM — but you still have to update the tool implementation whenever your backend or data sources change.

from langchain_core.tools import tool

@tool
def search_docs(query: str) -> str:
    """Search the document store."""
    return search_vector_store(query)  # backend-specific helper you must maintain

MCP Implementation

With MCP, your application connects to a standardized interface, and the server handles the specifics of document sources:

# MCP Client/Host: the client/host code stays the same

# MCP Server: Define your MCP server
# Import necessary libraries
from mcp.server.fastmcp import FastMCP

# Initialize FastMCP server
mcp = FastMCP("your-server")

# Implement your server's tools
@mcp.tool()
async def example_tool(param1: str, param2: int) -> str:
    """An example tool that demonstrates MCP functionality.
    
    Args:
        param1: First parameter description
        param2: Second parameter description
    
    Returns:
        A string result from the tool execution
    """
    # Tool implementation
    result = f"Processed {param1} with value {param2}"
    return result

# Example of adding a resource (optional)
@mcp.resource("example://data")
async def get_example_resource() -> str:
    """Provides example data as a resource.

    Returns:
        Data that can be read by clients
    """
    return "Example resource data"

# Example of adding a prompt template (optional)
@mcp.prompt()
def example_prompt(purpose: str, action: str) -> str:
    """A prompt template you can fill in with a purpose and an action."""
    return f"This is a template for {purpose}. You can use it to {action}."

# Run the server
if __name__ == "__main__":
    mcp.run(transport="stdio")

Then, you configure the host or client (like Claude Desktop) to use the server by updating its configuration file.

{
    "mcpServers": {
        "your-server": {
            "command": "uv",
            "args": [
                "--directory",
                "/ABSOLUTE/PATH/TO/PARENT/FOLDER/your-server",
                "run",
                "your-server.py"
            ]
        }
    }
}

If you change where or how the resources/documents are stored, you update the server — not the client.

And for many use cases — especially in production environments like IDE extensions or commercial applications — you never touch the client code at all. MCP's decoupling is more than just a nice-to-have: it's a necessity. It isolates the application code so that only the server-side logic (tools, data sources, or embeddings) needs to evolve. The host application stays untouched. This enables rapid iteration and experimentation without risking regressions or violating application constraints.


Quick recap!

Hopefully, by now, it’s clear why MCP actually matters.

Imagine you're building an AI assistant that needs to:

  • Tap into a knowledge base
  • Execute code or scripts
  • Keep track of past user conversations

Without MCP, you're stuck writing custom glue code for every single integration. Sure, it works — until it doesn't. It's fragile, messy, and a nightmare to maintain at scale.

MCP fixes this by acting as a universal adapter between your model and the outside world. You can plug in new tools or data sources without rewriting your model logic. That means faster iteration, cleaner code, fewer bugs, and AI applications that are actually modular and maintainable.

And I hope you were paying attention when I said MCP enables bidirectional communication between the host (client) and the server — because this unlocks one of MCP's strongest use cases: persistent memory.

Out of the box, LLMs are goldfish. They forget everything unless you manually stuff the whole history into the context window. But with MCP, you can:

  • Store and retrieve past interactions
  • Keep track of long-term user preferences
  • Build assistants that actually "remember" full projects or ongoing sessions

No more clunky prompt-chaining hacks or fragile memory workarounds. MCP gives your model a brain that lasts longer than a single chat.
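To make that less abstract, here's a minimal, hypothetical sketch of a "memory" server using the same FastMCP API we'll build with later in this article. The tool names and in-memory store are my own illustration, not an official server:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("memory")
notes: dict[str, list[str]] = {}  # in-memory store; use a database for real persistence

@mcp.tool()
async def remember(user_id: str, note: str) -> str:
    """Store a note for a user so later chats can retrieve it."""
    notes.setdefault(user_id, []).append(note)
    return f"Stored note #{len(notes[user_id])} for {user_id}."

@mcp.tool()
async def recall(user_id: str) -> str:
    """Return everything remembered about a user."""
    return "\n".join(notes.get(user_id, [])) or "Nothing stored yet."

if __name__ == "__main__":
    mcp.run(transport="stdio")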

Core Capabilities of an MCP Server

With all that in mind, it's pretty clear: the MCP server is the MVP of the whole protocol.

It's the central hub that defines the capabilities your model can actually use. There are three main types:

  • Resources: Think of these as external data sources — PDFs, APIs, databases. The model can pull them in for context, but it can't change them. Read-only.
  • Tools: These are the actual functions the model can call — run code, search the web, generate summaries, you name it.
  • Prompts: Predefined templates that guide the model’s behavior or structure its responses. Like giving it a playbook.

What makes MCP powerful is that all of these are exposed through a single, consistent protocol. That means the model can request, invoke, and incorporate them without needing custom logic for each one. Just plug into the MCP server, and everything's ready to go.

Real-World Example: Claude Desktop + MCP (Pre-built Servers)

Out of the box, Anthropic offers a bunch of pre-built MCP servers you can plug into your AI apps — things like Claude Desktop, Cursor, and more. Setup is super quick and painless.

For the full list of available servers, head over to the MCP Servers Repository. It's your buffet of ready-to-use integrations.

In this section, I'll walk you through a practical example: extending Claude Desktop so it can read from your computer's file system, write new files, move them around, and even search through them.

This walkthrough is based on the Quickstart guide from the official docs, but honestly, that guide skips a few key details — especially if you've never touched these settings before. So I'm filling in the gaps and sharing the extra tips I picked up along the way to save you the headache.

1. Download Claude Desktop

First things first — grab Claude Desktop. Choose the version for macOS or Windows (sorry, Linux folks — no support just yet).

Follow the installation steps as prompted.

Already have it installed? Make sure you're on the latest version by clicking the Claude menu on your computer and selecting "Check for Updates…"

2. Check the Prerequisites

You'll need Node.js installed on your machine to get this running smoothly.

To check if you already have Node installed:

  • On macOS: Open the Terminal from your Applications folder.
  • On Windows: Press Windows + R, type cmd, and hit Enter.
  • Then run the following command in your terminal:
node --version

If you see a version number, you're good to go. If not, head over to nodejs.org and install the latest LTS version.

3. Enable Developer Mode

Open Claude Desktop and click on the "Claude" menu in the top-left corner of your screen. From there, select Help.

On macOS, it should look something like this:

Image by author

From the drop-down menu, select “Enable Developer Mode.”

If you've already enabled it before, it won't show up again — but if this is your first time, it should be right there in the list.

Once Developer Mode is turned on:

  1. Click on "Claude" in the top-left menu again.
  2. Select "Settings."
  3. A new pop-up window will appear — look for the "Developer" tab in the left-hand navigation bar. That's where all the good stuff lives.
Image by author

4. Set Up the Configuration File

Still in the Developer settings, click on "Edit Config."

This will create a configuration file if one doesn't already exist and open it directly in your file system.

The file location depends on your OS:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

This is where you'll define the servers and capabilities you want Claude to use — so keep this file open; we'll be editing it next.

Image by author

Open the config file (claude_desktop_config.json) in any text editor. Replace its contents with the following, depending on your OS:

For macOS:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/username/Desktop",
        "/Users/username/Downloads"
      ]
    }
  }
}

For Windows:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "C:UsersusernameDesktop",
        "C:UsersusernameDownloads"
      ]
    }
  }
}

Make sure to replace "username" with your actual system username. The paths listed here should point to valid folders on your machine — this setup gives Claude access to your Desktop and Downloads, but you can add more paths if needed.

What This Does

This config tells Claude Desktop to automatically start an MCP server called "filesystem" every time the app launches. That server runs using npx and spins up @modelcontextprotocol/server-filesystem, which is what lets Claude interact with your file system — read, write, move files, search directories, etc.
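If you want to sanity-check the server outside of Claude first, you can launch the same command manually in a terminal (assuming Node.js is installed); it starts the server on stdio and waits for MCP messages until you stop it with Ctrl+C:

npx -y @modelcontextprotocol/server-filesystem /Users/username/Desktop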

⚠️ Command Privileges

Just a heads-up: Claude will run these commands with your user account's permissions, meaning it can access and modify local files. Only add commands to the config file if you understand and trust the server you're hooking up — no random packages from the internet!

5. Restart Claude

Once you've updated and saved your configuration file, restart Claude Desktop to apply the changes.

After it boots up, you should see a hammer icon in the bottom-left corner of the input box. That's your signal that the developer tools — and your custom MCP server — are up and running.

Image by author

After clicking the hammer icon, you should see the list of tools exposed by the Filesystem MCP Server — things like reading files, writing files, searching directories, and so on.

Image by author

If you don't see your server listed or nothing shows up, don't worry. Hop over to the Troubleshooting section in the official documentation for some quick debugging tips to get things back on track.

6. Try It Out!

Now that everything's set up, you can start chatting with Claude about your file system — and it will know when to call the right tools.

Here are a few things you can try asking: "Can you write a short poem and save it to my Desktop?", "What files are in my Downloads folder?", or "Move that file from my Downloads to my Desktop."

When needed, Claude will automatically invoke the appropriate tools and ask for your approval before doing anything on your system. You stay in control, and Claude gets the job done.

Build Your Own: Custom MCP Server from Scratch

Alright, ready to level up?

In this section, you'll go from user to builder. We're going to write a custom MCP server that Claude can talk to — specifically, a tool that lets it search the latest documentation from AI libraries like LangChain, OpenAI, MCP (yes, we're using MCP to learn MCP), and LlamaIndex.

Because let's be honest — how many times have you watched Claude confidently spit out deprecated code or reference libraries that haven't been updated since 2021?

This tool uses real-time search, scrapes live content, and gives your assistant fresh knowledge on demand. Yes, it's as cool as it sounds.

The project is built using the official MCP SDK from Anthropic. If you're comfortable with Python and the command line, you'll be up and running in no time. And even if you're not — don't worry. We'll walk through everything step by step, including the parts most tutorials just assume you already know.

Prerequisites

Before we dive in, here are the things you need installed on your system:

  • Python 3.10 or higher — this is the programming language we'll use
  • MCP SDK (v1.2.0 or higher) — this gives you all the tools to create a Claude-compatible server (we'll install it in an upcoming step)
  • uv (package manager) — think of it as a modern version of pip, but much faster and easier to use for projects (we'll install it in the next step)

Step 1: Install uv (the Package Manager)

Install uv by opening your terminal and running the command below for your OS.

On macOS/Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

On Windows:

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

This will download and install uv on your machine. Once it's done, close and reopen your terminal to make sure the uv command is recognized. (If you're on Windows, you can use WSL or follow their Windows instructions.)

To check that it's working, run this command in your terminal:

uv --version

If you see a version number, you're good to go.

Step 2: Set Up Your Project

Now we’re going to create a folder for our MCP server and get all of the pieces in place. In your terminal, run these commands:

# Create and enter your project folder
uv init mcp-server
cd mcp-server

# Create a virtual environment
uv venv
# Activate the virtual environment
source .venv/bin/activate  # Windows: .venv\Scripts\activate

Wait — what’s all this?

  • uv init mcp-server sets up a blank Python project named mcp-server.
  • uv venv creates a virtual environment (your private sandbox for this project).
  • source .venv/bin/activate activates that environment so everything you install stays inside it.

Step 3: Install the Required Packages

Inside your virtual environment, install the tools you’ll need:

uv add "mcp[cli]" httpx beautifulsoup4 python-dotenv

Here’s what each package does:

  • mcp[cli]: The core SDK that lets you build servers Claude can talk to
  • httpx: Used to make HTTP requests (like fetching data from websites)
  • beautifulsoup4: Helps us extract readable text from messy HTML
  • python-dotenv: Lets us load API keys from a .env file

Before we start writing code, it's a good idea to open the project folder in a text editor so you can see all your files in one place and edit them easily.

If you're using VS Code (which I highly recommend if you're not sure what to use), just run this from inside your mcp-server folder:

code .

This command tells VS Code to open the current folder (. just means “right here”).

If the code command isn't recognized, open VS Code manually, press Cmd+Shift+P (macOS) or Ctrl+Shift+P (Windows), and run "Shell Command: Install 'code' command in PATH". Then open a fresh terminal in your mcp-server folder and try code . again.

Step 3.5: Get Your Serper API Key (for Web Search)

To power our real-time documentation search, we'll use Serper — a simple and fast Google Search API that works great for AI agents.

Here's how to set it up:

  1. Head over to serper.dev and click Sign Up:
    It's free for basic usage and works perfectly for this project.
  2. Once signed in, go to your Dashboard:
    You'll see your API Key listed there. Copy it.
  3. In your project folder, create a file called .env:
    This is where we'll store the key securely (so we're not hardcoding it).
  4. Add this line to your .env file:
SERPER_API_KEY=your-api-key-here

Replace your-api-key-here with the actual key you copied.

That's it — now your server can talk to Google via Serper and pull in fresh docs when Claude asks.
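If you'd like a quick, optional way to confirm the key actually loads from .env, a few lines of Python will do. This check is my own helper, not part of the project template:

# check_env.py — optional sanity check, run inside your virtual environment
from dotenv import load_dotenv
import os

load_dotenv()  # reads .env from the current directory
print("SERPER_API_KEY loaded:", bool(os.getenv("SERPER_API_KEY")))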

Step 4: Write the Server Code

Now that your project is set up and your virtual environment is running, it's time to actually write the server.

This server goes to:

  • Accept an issue like:
  • Know which documentation site to go looking (e.g., LangChain, OpenAI, etc.)
  • Use an online search API (Serper) to seek out the perfect links from that site
  • Visit those pages and scrape the actual content
  • Return that content to Claude

This is what makes your Claude smarter — it can look things up from real docs instead of making things up based on old data.


⚠️ Quick Reminder About Ethical Scraping

Always respect the site you're scraping. Use this responsibly. Avoid hitting pages too often, don't scrape behind login walls, and check the site's robots.txt file to see what's allowed. You can read more about it here.

Your tool is only as useful as it is respectful. That's how we build AI systems that are not just smart — but sustainable too.


1. Create Your Server File

First, run this from inside your mcp-server folder to create a new file:

touch main.py

Then open that file in your editor (if it isn't open already). Replace the code there with the following:

from mcp.server.fastmcp import FastMCP
from dotenv import load_dotenv
import httpx
import json
import os
from bs4 import BeautifulSoup
load_dotenv()

mcp = FastMCP("docs")

USER_AGENT = "docs-app/1.0"
SERPER_URL = "https://google.serper.dev/search"

docs_urls = {
    "langchain": "python.langchain.com/docs",
    "llama-index": "docs.llamaindex.ai/en/stable",
    "openai": "platform.openai.com/docs",
    "mcp": "modelcontextprotocol.io"
}

async def search_web(query: str) -> dict | None:
    payload = json.dumps({"q": query, "num": 2})
    headers = {
        "X-API-KEY": os.getenv("SERPER_API_KEY"),
        "Content-Type": "application/json",
    }

    async with httpx.AsyncClient() as client:
        try:
            response = await client.post(
                SERPER_URL, headers=headers, data=payload, timeout=30.0
            )
            response.raise_for_status()
            return response.json()
        except httpx.TimeoutException:
            return {"organic": []}
        except httpx.HTTPStatusError as e:
            print(f"HTTP error occurred: {e}")
            return {"organic": []}
  
async def fetch_url(url: str) -> str:
    async with httpx.AsyncClient(headers={"User-Agent": USER_AGENT}) as client:
        try:
            response = await client.get(url, timeout=30.0)
            response.raise_for_status()
            soup = BeautifulSoup(response.text, "html.parser")
            
            # Try to extract the main content and drop navigation, sidebars, etc.
            main_content = soup.find("main") or soup.find("article") or soup.find("div", class_="content")
            
            if main_content:
                text = main_content.get_text(separator="\n", strip=True)
            else:
                text = soup.get_text(separator="\n", strip=True)
                
            # Limit content length if it's too large
            if len(text) > 8000:
                text = text[:8000] + "... [content truncated]"
                
            return text
        except httpx.TimeoutException:
            return "Timeout error when fetching the URL"
        except httpx.HTTPStatusError as e:
            return f"HTTP error occurred: {e}"

@mcp.tool()  
async def get_docs(query: str, library: str) -> str:
    """
    Search the latest docs for a given query and library.
    Supports langchain, openai, mcp and llama-index.

    Args:
        query: The query to search for (e.g. "Chroma DB")
        library: The library to search in (e.g. "langchain")

    Returns:
        Text from the docs
    """
    if library not in docs_urls:
        raise ValueError(f"Library {library} not supported by this tool. Supported libraries: {', '.join(docs_urls.keys())}")
    
    query = f"site:{docs_urls[library]} {query}"
    results = await search_web(query)
    
    if not results or len(results.get("organic", [])) == 0:
        return "No results found"
    
    combined_text = ""
    for i, result in enumerate(results["organic"]):
        url = result["link"]
        title = result.get("title", "No title")
        
        # Add separator between results
        if i > 0:
            combined_text += "nn" + "="*50 + "nn"
            
        combined_text += f"Source: {title}nURL: {url}nn"
        page_content = await fetch_url(url)
        combined_text += page_content
    
    return combined_text


if __name__ == "__main__":
    mcp.run(transport="stdio")

2. How The Code Works

First, we set up the foundation of our custom MCP server. It pulls in all the libraries you'll need — like tools for making web requests, cleaning up webpages, and loading secret API keys. It also creates your server and names it "docs" so Claude knows what to call it. Then, it lists the documentation sites (like LangChain, OpenAI, MCP, and LlamaIndex) your tool will search through. Finally, it sets the URL for the Serper API, which is what the tool will use to send Google search queries. Think of it as prepping your workspace before actually building the tool.

Here's the relevant code snippet:
from mcp.server.fastmcp import FastMCP
from dotenv import load_dotenv
import httpx
import json
import os
from bs4 import BeautifulSoup
load_dotenv()

mcp = FastMCP("docs")

USER_AGENT = "docs-app/1.0"
SERPER_URL = "https://google.serper.dev/search"

docs_urls = {
    "langchain": "python.langchain.com/docs",
    "llama-index": "docs.llamaindex.ai/en/stable",
    "openai": "platform.openai.com/docs",
    "mcp": "modelcontextprotocol.io"
}

Then, we define a function that lets our tool talk to the Serper API, which we'll use as a wrapper around Google Search.

This function, search_web, takes in a query string, builds a request, and sends it off to the search engine. It includes your API key for authentication, tells Serper we're sending JSON, and limits the number of search results to two for speed and focus. The function returns a dictionary containing the structured results, and it also gracefully handles timeouts or any errors that might come from the API. This is the part that figures out where to look before we even fetch the content.

Here's the relevant code snippet:
async def search_web(query: str) -> dict | None:
    payload = json.dumps({"q": query, "num": 2})
    headers = {
        "X-API-KEY": os.getenv("SERPER_API_KEY"),
        "Content-Type": "application/json",
    }

    async with httpx.AsyncClient() as client:
        try:
            response = await client.post(
                SERPER_URL, headers=headers, data=payload, timeout=30.0
            )
            response.raise_for_status()
            return response.json()
        except httpx.TimeoutException:
            return {"organic": []}
        except httpx.HTTPStatusError as e:
            print(f"HTTP error occurred: {e}")
            return {"organic": []}

Once we've found a few promising links, we need a way to extract just the useful content from those web pages. That's what fetch_url does. It visits each URL, grabs the full HTML of the page, and then uses BeautifulSoup to filter out just the readable parts — things like paragraphs, headings, and examples. We try to prioritize sections like <main>, <article>, or containers with a .content class, which usually hold the good stuff. If the page is super long, we also trim it down to avoid flooding the output. Think of this as the "reader mode" for Claude — it turns cluttered webpages into clean text it can understand.

Here's the relevant code snippet:
async def fetch_url(url: str) -> str:
    async with httpx.AsyncClient(headers={"User-Agent": USER_AGENT}) as client:
        try:
            response = await client.get(url, timeout=30.0)
            response.raise_for_status()
            soup = BeautifulSoup(response.text, "html.parser")
            
            # Try to extract the main content and drop navigation, sidebars, etc.
            main_content = soup.find("main") or soup.find("article") or soup.find("div", class_="content")
            
            if main_content:
                text = main_content.get_text(separator="\n", strip=True)
            else:
                text = soup.get_text(separator="\n", strip=True)
                
            # Limit content length if it's too large
            if len(text) > 8000:
                text = text[:8000] + "... [content truncated]"
                
            return text
        except httpx.TimeoutException:
            return "Timeout error when fetching the URL"
        except httpx.HTTPStatusError as e:
            return f"HTTP error occurred: {e}"

Now comes the main act: the actual tool function that Claude will call.

The get_docs function is where everything comes together. Claude will pass it a query and the name of a library (like "llama-index"), and this function will:

  1. Check if that library is supported
  2. Build a site-specific search query (e.g., site:docs.llamaindex.ai "vector store")
  3. Use search_web() to get the top results
  4. Use fetch_url() to visit each page and extract the content
  5. Format everything into a nice, readable string that Claude can understand

We also include titles, URLs, and a few visual separators between each result, so Claude can reference or cite them if needed.

Here's the relevant code snippet:
@mcp.tool()  
async def get_docs(query: str, library: str) -> str:
    """
    Search the latest docs for a given query and library.
    Supports langchain, openai, mcp and llama-index.

    Args:
        query: The query to search for (e.g. "Chroma DB")
        library: The library to search in (e.g. "langchain")

    Returns:
        Text from the docs
    """
    if library not in docs_urls:
        raise ValueError(f"Library {library} not supported by this tool. Supported libraries: {', '.join(docs_urls.keys())}")
    
    query = f"site:{docs_urls[library]} {query}"
    results = await search_web(query)
    
    if not results or len(results.get("organic", [])) == 0:
        return "No results found"
    
    combined_text = ""
    for i, result in enumerate(results["organic"]):
        url = result["link"]
        title = result.get("title", "No title")
        
        # Add separator between results
        if i > 0:
            combined_text += "nn" + "="*50 + "nn"
            
        combined_text += f"Source: {title}nURL: {url}nn"
        page_content = await fetch_url(url)
        combined_text += page_content
    
    return combined_text

Finally, this line kicks everything off. It tells the MCP server to start listening for input from Claude using standard input/output (which is how Claude Desktop talks to external tools). This line always lives at the bottom of your script.

if __name__ == "__main__":
    mcp.run(transport="stdio")

Step 5: Test and Run Your Server

Alright, your server is coded and ready to go — now let's run it and see it in action. There are two main ways to test your MCP server:

Development Mode (Recommended for Building & Testing)

The easiest way to test your server during development is to use:

mcp dev main.py

This command launches the MCP Inspector, which opens up a local web interface in your browser. It's like a control panel for your server.

Image by author

Here's what you can do with it:

  • Interactively test your tools (like get_docs)
  • View detailed logs and error messages in real time
  • Monitor performance and response times
  • Set or override environment variables temporarily

Use this mode while building and debugging. You'll be able to see exactly what Claude would see and quickly fix any issues before integrating with the full Claude Desktop app.

Claude Desktop Integration (For Regular Use)

Once your server works and you're happy with it, you can install it into Claude Desktop:

mcp install main.py

This command will:

  • Automatically add your server to Claude's configuration file (the JSON file we fiddled with earlier)
  • Enable it to run every time you launch Claude Desktop
  • Make it available through the Developer Tools (🔨 hammer icon)

⚠️ Current Issue: uv Command Is Hardcoded

Right now, there's an open issue in the mcp library: when it writes your server into Claude's config file, it hardcodes the command as just "uv". That works if uv is globally available in your PATH — which isn't always the case, especially if you installed it locally with pipx or a custom method.

So we need to fix it manually. Here's how:

Manually Update Claude’s Config File
  1. Open your Claude config file:

On macOS:

code ~/Library/Application\ Support/Claude/claude_desktop_config.json

On Windows:

code $env:AppData\Claude\claude_desktop_config.json

💡 If you're not using VS Code, replace code with your text editor of choice (like open, nano, or subl).

2. Find the section that looks like this:

"docs": {
  "command": "uv",
  "args": [
    "run",
    "--with",
    "mcp[cli]",
    "mcp",
    "run",
    "/PATH/TO/mcp-server/predominant.py"
  ]
}

3. Update the "command" value to the absolute path of uv on your system.

  • To find it, run this in your terminal:
which uv
  • It’ll return something like:
/Users/your_username/.local/bin/uv
  • Now replace "uv" within the config with that full path:
"docs": {
  "command": "/Users/your_username/.local/bin/uv",
  "args": [
    "run",
    "--with",
    "mcp[cli]",
    "mcp",
    "run",
    "PATH/TO/mcp-server/predominant.py"
  ]
}

4. Save the file and restart Claude Desktop.

That’s It!

Now Claude Desktop will recognize your custom docs tool, and anytime you open the Developer Tools (🔨), it'll show up. You can chat with Claude and ask things like "What does the LangChain documentation say about Chroma DB?"

And Claude will call your server, search the docs, pull the content, and use it in its response — live. You can view a quick demo here.

Image by author

🎉 Congrats, You’ve Mastered MCP!

You did it. You've gone from zero to building, testing, and integrating your very own Claude-compatible MCP server — and that's no small feat.

Take a moment. Stretch. Sip some coffee. Pat yourself on the back. You didn’t just write some Python — you built an actual, production-grade tool that extends an LLM’s capabilities in a modular, secure, and powerful way.

Seriously, most devs don’t get this far. You now understand:

  • How MCP works under the hood
  • How to build and expose tools Claude can use
  • How to wire up real-time web search and content extraction
  • How to debug, test, and integrate the whole thing with Claude Desktop

You didn’t just learn it — you shipped it.

Want to go even deeper? There's a whole world of agentic workflows, custom tools, and collaborative LLMs waiting to be built. But for now?

Take the win. You earned it. 🏆

Now go ask Claude something fun and let your new tool flex.


References

[1] Anthropic (2024), Model Context Protocol, modelcontextprotocol.io
[2] LangChain (2024), Understanding MCP From Scratch, Notion
[3] A. Alejandro (2024), GitHub Repository
[4] O. Santos (2024), Medium
