Tools for Your LLM: a Deep Dive into MCP


MCP is a technique that can turn LLMs into actual agents. That’s because MCP provides tools to your LLM, which it can use to retrieve live information or perform actions on your behalf.

Like any other tool in the toolbox, I believe that in order to apply MCP effectively, you have to know it thoroughly. So I approached it in my usual way: get my hands on it, poke it, take it apart, put it back together and get it working again.

The goals for this week:

  • get a solid understanding of MCP: what is it?
  • build an MCP server and connect it to an LLM
  • understand when to use MCP
  • explore considerations around MCP

1) What’s MCP?

MCP (Model Context Protocol) is a protocol designed to extend LLM clients. An LLM client is anything that runs an LLM: think of Claude, ChatGPT or your own LangGraph agentic chatbot. In this article we’ll use Claude Desktop as the LLM client and build an MCP server for it that extends its abilities.

First let’s understand what MCP really is.

A helpful analogy

Think of MCP the same way you think of browser extensions. A browser extension adds capabilities to your browser. An MCP server adds capabilities to your LLM. In both cases you provide a small program that the client (browser or LLM) can load and communicate with to make it do more.

This program is called an MCP server, and LLM clients can use it to e.g. retrieve information or perform actions.

When is a program an MCP server?

Any program can become an MCP server as long as it implements the Model Context Protocol. The protocol defines:

  1. which functions the server must expose (capabilities)
  2. how these functions have to be described (tool metadata)
  3. how the LLM can call them (with JSON request formats)
  4. how the server must respond (with JSON result formats)

An MCP server is any program that follows these MCP message rules. Notice that language, runtime and location don’t matter.

Key capabilities:

  • declaring tools
  • accepting a tool call request
  • executing the requested function
  • returning a result or error

Example of a tool-call message:

{
  "method": "tools/call",
  "params": {
    "name": "get_weather",
    "arguments": {"city": "Groningen"}
  }
}

Sending this JSON means: call the tool named get_weather with the city argument set to “Groningen”.
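
The server then executes the matching function and replies with a JSON result. A minimal sketch of what that reply could look like (the weather text is invented for illustration; the content structure follows the MCP result format):

{
  "content": [
    {"type": "text", "text": "12 °C, light rain"}
  ]
}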


2) Creating an MCP server

Since any program can be an MCP server, let’s create one.

Imagine we work for a cinema and we want to make it possible for agents to help people buy tickets. This way a user can decide which movie to pick by chatting with ChatGPT, or instruct Claude to purchase tickets.

Of course these LLMs aren’t aware of what’s happening in our cinema, so we’ll need to expose our cinema’s API through MCP so that the LLMs can interact with it.

The simplest possible MCP server

We’ll use fastmcp, a Python package that wraps Python functions so that they conform to the MCP specification. We can then “present” this code to LLM clients so that they’re aware of the functions and can call them.

from fastmcp import FastMCP

mcp = FastMCP("example_server")

@mcp.tool
def list_movies() -> list[str]:
    """ List the movies that are currently playing """
    # Simulate a GET request to our /movies endpoint
    return ["Shrek", "Inception", "The Matrix", "Lord of the Rings"]

if __name__ == "__main__":
    mcp.run()
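
Running python cinema_mcp.py starts the server. By default FastMCP serves over stdio (reading requests from stdin and writing results to stdout), which is exactly how Claude Desktop will launch and talk to it in the next step.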

The code above defines a server and registers a tool. The docstring and type hints help fastmcp describe the tool to the LLM client (as required by the Model Context Protocol). Based on this description, the agent decides whether the function is suitable for fulfilling the task it has set out to do.
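
For illustration, the tool description that fastmcp derives from the function above looks roughly like this (a sketch; the exact fields follow the MCP tools/list format):

{
  "name": "list_movies",
  "description": "List the movies that are currently playing",
  "inputSchema": {"type": "object", "properties": {}}
}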

Connecting Claude Desktop to the MCP server

In order for our LLM to be “aware” of the MCP server, we have to tell it where to find the program. We register our new server in Claude Desktop by opening claude_desktop_config.json and updating it so that it looks like this:

{
  "mcpServers": {
    "cinema_server": {
      "command": "/Users/mikehuls/explore_mcp/.venv/bin/python",
      "args": [
        "/Users/mikehuls/explore_mcp/cinema_mcp.py"
      ]
    }
  }
}

Now that our MCP server is registered, Claude can use it; it can call list_movies(), for example. The functions in registered MCP servers become first-class tools that the LLM can decide to use.

Chatting with our agent (image by author)

As you can see, Claude has executed the function from our MCP server and has access to the resulting value. All of this in just a few lines of code.

With a few more lines we wrap even more API endpoints in our MCP server, allowing the LLM to call functions that show screening times or even perform actions on our behalf by making a reservation, as sketched below.
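
A minimal sketch of what these extra tools could look like (the function names, arguments and return values are invented for illustration):

@mcp.tool
def get_screenings(movie: str) -> list[str]:
    """ List the screening times for the given movie """
    # Simulate a GET request to our /screenings endpoint
    return ["18:00", "21:30"]

@mcp.tool
def make_reservation(movie: str, screening_time: str, seats: int) -> str:
    """ Reserve the given number of seats for a screening """
    # Simulate a POST request to our /reservations endpoint
    return f"Reserved {seats} seat(s) for {movie} at {screening_time}"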

Allowing our agent to reserve a seat (image by author)


3) When to use MCP

MCP is good when:

  • You want an LLM to access live data
  • You want an LLM to perform actions (create tasks, fetch files, write records)
  • You want to expose internal systems in a controlled way
  • You want to share your tools with others as a package they can plug into their LLM

Users benefit because MCP lets their LLM become a more powerful assistant.

Providers benefit because MCP lets them expose their systems safely and consistently.

A typical pattern is a “tool suite” that exposes backend APIs. Instead of clicking through UI screens, a user can ask an assistant to handle the workflow for them.


4) Considerations

Since its release in November 2024, MCP has been widely adopted and quickly became the default way to connect AI agents to external systems. But it’s not without trade-offs; MCP introduces structural overhead and real security risks that, in my view, engineers should be aware of before using it in production.

a) Security

If you download an unknown MCP server and connect it to your LLM, you’re effectively granting that server file and network access, access to local credentials and command execution permissions. A malicious tool could:

  • read or delete files
  • exfiltrate private data (e.g. your .ssh keys)
  • scan your network
  • modify production systems
  • steal tokens and keys

MCP is only as safe as the server you choose to trust. Without guardrails you’re basically giving an LLM full control over your computer. And because adding new tools is so easy, it’s very easy to over-expose your systems.

The browser-extension analogy applies here as well: most are safe, but malicious ones can do real damage. As with browser extensions, use trusted sources like verified repositories, inspect the source code if possible and sandbox execution if you’re unsure. Implement strict permissions and least-privilege policies.

b) Inflated context window, token inefficiency and latency

MCP servers describe every tool in detail: names, argument schemas, descriptions and result formats. The LLM client loads all this metadata up-front into the model context so that it knows which tools exist and how to use them.

This means that if your agent uses many tools or complex schemas, the prompt can grow significantly. Not only does this use a lot of tokens, it also uses up the space that remains for conversation history and task-specific instructions. Every tool you expose permanently eats a slice of the available context.

Moreover, every tool call introduces reasoning overhead, schema parsing, context reassignment and a full round-trip from model -> MCP client -> MCP server -> back to the model. This can be far too heavy for latency-sensitive pipelines.

c) Complexity shifts into the model

The LLM must make all the tough decisions:

  • whether to call a tool at all
  • which tool to call
  • which arguments to use

All of this happens inside the model’s reasoning rather than through explicit orchestration logic. Although this initially feels magically convenient and efficient, at scale it can become unpredictable, harder to debug and harder to make deterministic.
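
To make that contrast concrete, here is a minimal sketch (reusing the hypothetical cinema tools from earlier) of the same workflow written as explicit orchestration logic, where plain code, not the model, decides which function runs and in what order:

# Explicit orchestration: the control flow is deterministic and easy to test.
def book_ticket(movie: str, screening_time: str, seats: int) -> str:
    if movie not in list_movies():
        return f"{movie} is not currently playing"
    if screening_time not in get_screenings(movie):
        return f"No screening of {movie} at {screening_time}"
    return make_reservation(movie, screening_time, seats)

# With MCP, the LLM makes these same decisions inside its reasoning instead.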


Conclusion

MCP is simple and powerful at the same time. It’s a standardized way to let LLMs call real programs. Once a program implements MCP, any compliant LLM client can use it as an extension. This opens the door to assistants that can query APIs, perform tasks and interact with real systems in a structured way.

But with great power comes great responsibility. Treat MCP servers with the same caution as software that has full access to your machine. Its design also has implications for token usage, latency and strain on the LLM. These trade-offs may undermine the core advantage MCP is known for: turning agents into efficient, real-world tools.

When used intentionally and securely, MCP offers a clean foundation for building agentic assistants that can actually do things rather than just talk about them.


I hope this article was as clear as I intended it to be, but if this is not the case, please let me know what I can do to clarify further. In the meantime, check out my other articles on all kinds of programming-related topics.

Happy coding!

— Mike

P.S. Like what I’m doing? Follow me!
