From: aidotengineer

The Model Context Protocol (MCP) is an open protocol designed to enable seamless integration between AI applications, agents, and external tools and data sources [02:00:00]. It aims to standardize how AI applications interact with these external systems [03:10:00].

Philosophy and Motivation Behind MCP

The core motivation behind MCP is that models are only as effective as the context provided to them [01:18:00]. Historically, context was supplied to chatbots manually, through copy-pasting or typing [01:33:00]. Over time, systems evolved to let models directly access user data and context, making them more powerful and personalized [01:46:00].

MCP draws inspiration from established protocols:

  • APIs: Standardized how web applications' front ends interact with back ends, giving access to servers, databases, and services [02:16:00].
  • LSP (Language Server Protocol): Standardizes how Integrated Development Environments (IDEs) interact with language-specific tooling, so any LSP-compatible IDE can work with any language that provides an LSP server [02:40:00].

Before MCP, the AI industry faced significant fragmentation in building AI systems [03:47:00]. Teams created custom implementations for integrating context, prompting logic, and tool/data access [03:55:00]. This produced an N×M problem: each of N client applications needed its own integration with each of M servers [06:06:00].

The world with MCP promotes standardized AI development [04:18:00]. An MCP client (e.g., Anthropic's first-party applications, Cursor, Windsurf, Goose) can connect to any MCP server with zero additional work [04:24:00]. An MCP server acts as a wrapper, federating access to various relevant systems and tools, such as databases, CRMs like Salesforce, or local systems like Git [04:52:00].

Value Proposition of MCP

MCP offers significant benefits to various stakeholders in the AI ecosystem:

  • Application Developers: Once an application client is MCP-compatible, it can connect to any MCP server without additional work [05:42:00].
  • Tool/API Providers: Developers can build an MCP server once and see it adopted across many AI applications [05:51:00].
  • End Users: Gain access to more powerful and context-rich AI applications [06:28:00].
  • Enterprises: Provides a clear way to separate concerns among different teams. For example, one team can own and maintain an MCP server interface for a vector database, allowing other teams to build AI applications faster without constantly needing direct access or custom implementations [06:48:00]. This mirrors the benefits of microservices architectures [07:49:00].

Technical Structure and Features of MCP

MCP standardizes interactions through three primary interfaces [03:17:00]:

Tools

Tools are “model-controlled” [10:27:00]. The server exposes tools to the client application, and the language model (LLM) in the client decides when to invoke them [10:37:00]. Descriptions of tool usage are part of the server definition [10:53:00]. Tool possibilities are vast (a minimal server sketch follows this list), including:

  • Read tools to retrieve data [11:05:00].
  • Write tools to send data or take actions in various systems [11:08:00].
  • Tools to update databases or write files to a local file system [11:15:00].
  • Tools can be exposed even when it’s ambiguous whether they should be invoked, leaving that decision to the LLM based on context [16:02:00].
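
To make this concrete, here is a minimal sketch of a server exposing one tool, assuming the official MCP Python SDK’s FastMCP helper; the server name and tool body are illustrative, not taken from the talk.

```python
# Minimal MCP server exposing a single read tool.
# Assumes the official MCP Python SDK (pip install mcp); names are illustrative.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("issue-tracker")

@mcp.tool()
def list_issues(repo: str) -> str:
    """List open issues for a repository. This docstring is the description
    the client's LLM reads when deciding whether to invoke the tool."""
    # A real server would call the tracker's API; stubbed for illustration.
    return f"Open issues in {repo}: #12 crash on save, #15 docs typo"

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport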

Resources

Resources are “application-controlled” and represent data exposed to the application [11:23:00]. Servers can define and create various data types, such as images, text files, or JSON objects [11:31:00]. The application then determines how to use these resources [11:45:00]. Resources give applications a richer interface to servers than plain text alone [11:50:00]. Use cases (a sketch follows this list) include:

  • Surfacing static or dynamic files [12:03:00].
  • Client applications sending information (user data, file system context) to the server, which interpolates it into complex data structures and sends it back [12:11:00].
  • In Claude for Desktop, resources appear as attachments that users can select and send to the chat, or that models can attach automatically when relevant to the task [12:24:00].
  • Resources in MCP can be dynamic, interpolated with context from the user or application [20:58:00].
  • Resource notifications let clients subscribe to a resource and be notified by the server when it is updated with new information [21:17:00].
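
A sketch of a dynamic resource in the same SDK; the URI template and payload are invented for illustration:

```python
# Dynamic resource: the URI template is interpolated with application context.
# Assumes the MCP Python SDK's FastMCP; scheme and data are illustrative.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("user-context")

@mcp.resource("users://{user_id}/profile")
def user_profile(user_id: str) -> str:
    """Expose a per-user profile document the client can attach to a chat."""
    return f'{{"user": "{user_id}", "plan": "pro", "locale": "en"}}'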

Prompts

Prompts are “user-controlled”: they are invoked directly by the user, as opposed to the model [12:59:00]. They are predefined templates for common interactions with a specific server (a sketch follows this list) [13:08:00].

  • An example is slash commands in an IDE like Zed: a user types a command (e.g., /GPR for a PR summary) and the server interpolates a longer, predefined prompt for the LLM [13:14:00].
  • Prompts can standardize common interactions, such as document Q&A or specific data formatting rules [13:51:00].
  • Similar to resources, prompts can also be dynamic and customized based on the task at hand [21:01:00].
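
A sketch of a prompt template in the same SDK, using a hypothetical /summarize-pr command rather than the /GPR example above:

```python
# User-controlled prompt template, e.g. backing a /summarize-pr slash command.
# Assumes the MCP Python SDK's FastMCP; the command and wording are illustrative.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("pr-helper")

@mcp.prompt()
def summarize_pr(diff: str) -> str:
    """Return the longer, predefined prompt the server interpolates for the LLM."""
    return f"Summarize this pull request as a changelog entry:\n\n{diff}"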

MCP distributes control across the system (model, application, and user) rather than concentrating it in the model alone [22:30:00].

Adoption and Real-World Examples

MCP has seen significant adoption, becoming a frequent topic in discussions with Anthropic’s customers [08:05:00].

  • Applications and IDEs: MCP gives users coding in an IDE a way to provide context to the AI, and lets agents interact with external systems like GitHub or documentation sites [08:21:00].
  • Server-Side Development: Over 1,100 community-built MCP servers have been open-sourced [08:44:00]. Companies like Cloudflare and Stripe have also published official integrations [50:30:00].
  • Open Source Contributions: There’s active contribution to the core protocol and its infrastructure layer [09:07:00].

Demonstration: Claude for Desktop

In a demonstration, Claude for Desktop (an MCP client) interacts with GitHub and Asana through MCP servers [22:52:00]. The user provides a GitHub repo URL and asks Claude to pull in the issues and triage them [23:13:00]. Claude automatically invokes a list-issues tool, summarizes the issues, and prioritizes them intelligently based on previous interactions [23:32:00]. When asked to add the issues to an Asana project, Claude uses the installed Asana server’s tools (list workspaces, search projects) to find the project and add the tasks [24:14:00].

Key takeaways from this example:

  • The Asana and GitHub servers were community-built with just a few hundred lines of code, primarily for surfacing tools [24:52:00].
  • Multiple distinct tools and systems interplay seamlessly within the Claude for Desktop interface, which acts as a central dashboard for daily tasks [25:07:00].
  • Other applications like Windsurf and Goose also use MCP, sometimes calling tools “extensions,” demonstrating the protocol’s adaptability to different application layers and UIs [25:52:00].

Demonstration: MCP Agent Framework

The MCP agent framework (mcp-agent), an open-source tool from LastMile AI, illustrates how MCP integrates with agent systems [30:21:00]. The framework allows defining different sub-agents for a larger task [31:17:00].

  • A “research agent” is defined as an expert web researcher, given access to Brave for web searching, a fetch tool for data retrieval, and the file system [31:22:00].
  • A “fact checker agent” uses the same tools to verify information [32:06:00].
  • A “research report writer agent” synthesizes data and produces a report, with access only to the file system and fetch tools [32:21:00].

The framework lets an overall orchestrator agent (which plans and tracks steps) leverage these specialized sub-agents [32:55:00]. MCP acts as an abstraction layer, letting agent builders focus on the task and agent interactions rather than on managing individual servers or tools [33:48:00]. It enables agents to interface with community-built, authoritative servers in a standardized, declarative way [34:55:00].
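
A hedged sketch of such a declaration using mcp-agent; the class and parameter names follow the library’s documented pattern but may differ across versions:

```python
# Declarative sub-agent definitions in the spirit of mcp-agent; treat this as
# a sketch rather than a recipe, since exact APIs may change between versions.
from mcp_agent.app import MCPApp
from mcp_agent.agents.agent import Agent

app = MCPApp(name="research_pipeline")

research_agent = Agent(
    name="research_agent",
    instruction="You are an expert web researcher.",
    server_names=["brave", "fetch", "filesystem"],  # MCP servers, not bespoke tools
)

report_writer = Agent(
    name="report_writer",
    instruction="Synthesize the gathered data into a report.",
    server_names=["fetch", "filesystem"],  # deliberately narrower access
)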

MCP as a Foundational Protocol for Agents

MCP is positioned to become a foundational protocol for agents [26:38:00], both because of its protocol features and because agent systems and the underlying models are becoming more effective [26:54:00].

Augmented LLMs and the Agent Loop

The concept of an “augmented LLM” involves an LLM that takes inputs, produces outputs, and uses its intelligence to decide on actions, but is augmented with access to retrieval systems, tools, and memory [27:29:00]. This allows the LLM to query and write data, invoke tools, respond to results, and maintain state across interactions [27:49:00].

MCP fits in as the entire bottom layer, federating access and making it easier for LLMs to interact with these systems in a standardized way [28:10:00]. This means agents can expand their capabilities even after initialization, discovering new ways to interact with the world without being pre-programmed for them [28:34:00].

Agent systems are fundamentally an augmented LLM running in a loop: performing tasks, working toward a goal, invoking tools, and responding to results until the task is complete [28:57:00]. MCP provides these capabilities to the augmented LLM in an open way, so agent builders can focus on the core loop, context management, and model usage, while users can customize the agent’s interactions with their own data [29:18:00].
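
A schematic of that loop, with `llm` and `session` as hypothetical stand-ins rather than any specific SDK:

```python
# Schematic agent loop: an augmented LLM runs until the task is done, with MCP
# supplying the tool layer underneath. `llm` and `session` are hypothetical.
def agent_loop(llm, session, task: str) -> str:
    tools = session.list_tools()                      # discovered, not hard-coded
    messages = [{"role": "user", "content": task}]
    while True:
        reply = llm.generate(messages, tools=tools)
        if not reply.tool_calls:                      # no more actions: done
            return reply.text
        for call in reply.tool_calls:
            result = session.call_tool(call.name, call.arguments)
            messages.append({"role": "tool", "name": call.name, "content": result})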

Sampling

Sampling is a powerful, though underutilized, MCP capability that allows an MCP server to request LLM inference calls (completions) from the client [53:49:00]. The server therefore doesn’t need to implement its own LLM interaction or host a model [54:02:00]. A sketch of the request shape follows the list below.

  • The client controls all LLM interactions, including hosting, model choice, privacy, and cost parameters [54:55:00].
  • The server can request inference with various parameters (model preferences, system/task prompts, temperature, max tokens), though the client retains discretion to fulfill the request [55:12:00].
  • This is crucial because clients and servers often interact without prior knowledge of each other, yet servers may still need to request intelligence [55:56:00].
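
The request itself is an ordinary JSON-RPC message from server to client. The field names below follow the spec’s sampling/createMessage method; the id and values are illustrative:

```python
# Shape of a server-to-client sampling request (sampling/createMessage).
# Field names follow the MCP spec; the id and values are illustrative.
sampling_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {"role": "user",
             "content": {"type": "text", "text": "Summarize these error logs."}}
        ],
        "systemPrompt": "You are a concise assistant.",
        "modelPreferences": {"intelligencePriority": 0.8, "costPriority": 0.2},
        "maxTokens": 400,
    },
}
# The client may honor, modify, or decline this request entirely.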

Composability

Composability means the distinction between a client and a server is logical, not physical: any application, API, or agent can act as both an MCP client and an MCP server [56:21:00]. This enables chained interactions where, for example, a user talks to a client, which calls an agent (acting as both client and server), which in turn invokes other servers (file system, web search) [56:40:00]. Data then flows back through the chain to the user [57:13:00].

This allows for building complex, multi-layered architectures where each layer specializes in a particular task [57:28:00]. The combination of sampling and composability is particularly exciting for agent systems, allowing for hierarchical agents where sampling requests are federated through layers back to the main application that controls the LLM interaction [01:11:41].
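
Sketched in code, a single process can expose a tool upstream while acting as a client downstream; `open_mcp_client` and `summarize` are hypothetical helpers used only to show the chaining:

```python
# Composability sketch: this process is an MCP *server* to its caller and an
# MCP *client* to downstream servers. `open_mcp_client` and `summarize` are
# hypothetical helpers, shown only to illustrate the chaining.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("research-agent")

@mcp.tool()
async def research(topic: str) -> str:
    """Tool exposed upstream; internally delegates to another MCP server."""
    async with open_mcp_client("web-search") as search:      # hypothetical
        hits = await search.call_tool("search", {"query": topic})
    return summarize(hits)                                   # hypothetical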

Future Developments and Roadmap for MCP

The roadmap for MCP focuses on enhancing its capabilities and ease of use.

Remote Servers and Authentication

A significant development is support for remote servers and OAuth 2.0 authentication (a transport sketch follows the list below) [01:13:22].

  • This allows MCP servers to live at public URLs and be discoverable [01:15:12].
  • The server orchestrates the OAuth handshake: the client opens the authorization flow in a browser, the server holds the OAuth token, and subsequent interactions use a session token the server provides to the client [01:14:22].
  • This removes developer friction, as users don’t need to understand MCP hosting or building; servers simply exist like websites [01:15:40].
  • It enables agents and LLMs to run on completely different systems from where the server is hosted, while maintaining privacy, security, and control [01:15:28].
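
In SDK terms, going remote is largely a transport change. A minimal sketch with FastMCP, which supports an SSE transport alongside the default stdio (auth wiring omitted):

```python
# Same server logic, different transport: serve over SSE at a public URL
# instead of stdio. Assumes the MCP Python SDK's FastMCP; auth wiring omitted.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("remote-demo")

if __name__ == "__main__":
    mcp.run(transport="sse")  # instead of the default stdio transport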

MCP Registry

To address the current fragmentation and lack of a centralized way to discover and pull in MCP servers, an official MCP registry API is under development [01:22:00].

  • This will be a unified and hosted metadata service, built in the open, with its schema and development fully transparent [01:23:30].
  • It will sit above existing package systems (npm, pip, and the Java, Rust, and Go ecosystems) where MCP servers are deployed [01:22:52].
  • The registry will solve problems like discovering a server’s protocol (standard I/O vs. SSE), whether it is local or URL-based, and verifying who built it (e.g., official verification by companies like Shopify) [01:23:10].
  • It will also support versioning, allowing tracking of changes to tools or tool descriptions [01:24:01]. Companies can host their own registries similar to Artifactory [01:24:45].

For agents, an MCP server registry enables “self-evolving” agents [01:35:59]. Agents can discover new capabilities and data dynamically, on the fly, without prior programming [01:36:04]. For example, a coding agent that needs to check logs could search the registry for a verified Grafana server, install or invoke it (even remotely), and proceed with the task [01:36:16]. This empowers agents to find and choose their own tools, making the augmented-LLM system even more powerful [01:37:04].
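
A purely hypothetical sketch of that discovery step; since the registry API is still being designed, the endpoint and response fields here are invented:

```python
# Hypothetical registry lookup: the official registry API is still in
# development, so this endpoint and schema are invented for illustration.
import requests

resp = requests.get(
    "https://registry.example/v0/servers",
    params={"search": "grafana", "verified": "true"},
    timeout=10,
)
for server in resp.json().get("servers", []):
    print(server["name"], server["url"], server.get("publisher"))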

Well-Known MCP

A complement to the registry is the concept of a .well-known/mcp.json endpoint [01:39:24]. A service like Shopify could host this file at shopify.com/.well-known/mcp.json, providing a verified interface that details its MCP endpoint, resource capabilities, and OAuth 2.0 authentication [01:39:30]. This lets agents directly discover and use MCP capabilities for known services [01:40:07].
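
No schema for this file has been published, so the following is an invented illustration of the kind of information it might carry:

```json
{
  "endpoint": "https://shopify.com/mcp",
  "capabilities": ["tools", "resources"],
  "auth": {
    "type": "oauth2",
    "authorization_url": "https://shopify.com/oauth/authorize"
  }
}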

This approach complements computer-vision-based “computer use” models (e.g., the model Anthropic released in October) [01:40:36]. For systems with well-defined APIs, MCP provides a predefined way to call them; for the long tail of systems without APIs, computer-use models can click around and interact with UIs [01:41:03]. The future vision is for both to coexist within a single agent [01:41:11].

Medium-Term Considerations

  • Stateful vs. Stateless Connections: Exploring more short-lived connections where clients can disconnect and reconnect without losing context, potentially bifurcating basic client-initiated requests from advanced server-initiated behaviors (like sampling or notifications that require long-lived connections) [01:41:41].
  • Streaming: Developing first-class support in the protocol for streaming multiple chunks of data from the server to the client over time [01:42:41].
  • Namespacing: Addressing conflicts when multiple servers expose tools with the same name, potentially through registry support, allowing logical grouping of tools, or introducing protocol-level namespacing [01:42:51].
  • Proactive Server Behavior: Refining patterns for server-initiated actions, where the server decides to ask the user for more information or notify them based on events or deterministic systems [01:43:31].