From: aidotengineer

The Model Context Protocol (MCP) is an open protocol developed by Anthropic’s applied AI team, designed to enable seamless integration between AI applications, agents, tools, and data sources [01:55:00]. It addresses the challenge that models are only as effective as the context provided to them [01:18:00].

Motivation and Philosophy Behind MCP

A year ago, context was primarily brought into AI chatbots by manual copy-pasting or typing [01:33:00]. Over time, these systems evolved to have hooks into user data and context, making them more powerful and personalized [01:42:00]. Anthropic observed significant fragmentation in how companies built AI systems, with different teams creating custom implementations for prompt logic, tool integration, and data access [03:47:00]. This fragmentation led to an “N times M problem,” where client applications and servers had myriad ways of interacting [06:06:00].

MCP aims to standardize how AI applications interact with external systems [03:10:00], providing a layer between application developers and tool/API developers that gives LLMs access to data [06:17:00].

Predecessors and Analogies

MCP draws inspiration from previous standardization protocols:

  • APIs: Standardized how the front end of a web application interacts with the back end, giving it access to servers, databases, and services [02:12:12].
  • LSP (Language Server Protocol): Standardizes how IDEs interact with language-specific tools [02:37:40]. An LSP-compatible IDE can interact with any Go LSP server, for example, to understand Go language features [02:50:00].

How MCP Works

MCP standardizes interactions between AI applications and external systems through three primary interfaces [03:14:00]:

1. Tools

Tools are “model-controlled,” meaning the server exposes them to the client application, and the Large Language Model (LLM) within the client decides the optimal time to invoke them [10:27:00].

  • Servers provide descriptions of how tools are used [10:51:00].
  • Tools can perform various actions, including reading data, writing data, updating databases, and managing local file systems [11:05:00].
  • Tools are the right choice when it is ambiguous whether and when the LLM should invoke them [16:02:00]. A minimal server sketch follows this list.
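
As a concrete illustration, here is a minimal sketch of a tool-exposing server using the official MCP Python SDK’s FastMCP helper; the server name and the stubbed tool are illustrative, not from the talk:

```python
# Minimal sketch of an MCP server exposing one tool, using the official
# Python SDK's FastMCP helper (pip install mcp). The server name and the
# tool body are illustrative stubs.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("github-triage")

@mcp.tool()
def list_issues(repo: str) -> str:
    """List open issues for a repository so the model can triage them."""
    # A real server would call the GitHub API here; this stub keeps the
    # example self-contained.
    return f"(stub) open issues for {repo}"

if __name__ == "__main__":
    # stdio is the default transport for locally run servers
    mcp.run()
```

The docstring doubles as the tool description the server advertises, which is how the LLM decides when to invoke it.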

2. Resources

Resources are “application-controlled”: data the server exposes to the client application [11:23:00].

  • Servers can define and create static or dynamic resources (e.g., images, text files, JSON) [11:31:00].
  • The client application determines how to use the resource [11:45:00].
  • Resources enable richer interactions beyond simple text-based chat [11:50:00].
  • In applications like Claude for Desktop, resources appear as attachments, allowing users to select and send them to the model [12:24:00]. They can also be automatically attached by the model based on relevance [12:43:00].
  • MCP supports resource notifications: a client can subscribe to a resource and be notified by the server when it is updated with new information or context [21:17:00]. A sketch of static and dynamic resources follows this list.
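
A sketch of one static and one dynamic resource, again using FastMCP; the URIs and contents are illustrative:

```python
# Sketch of resources on a FastMCP server: one static, one dynamic
# (templated by URI). URIs and contents are invented for illustration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("docs")

@mcp.resource("config://app")
def app_config() -> str:
    """A static resource: fixed configuration text."""
    return "theme=dark"

@mcp.resource("users://{user_id}/profile")
def user_profile(user_id: str) -> str:
    """A dynamic resource, interpolated from the URI template."""
    return f"(stub) profile for user {user_id}"
```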

3. Prompts

Prompts are “user-controlled” and act as predefined templates for common interactions with a specific server [12:59:00].

  • They are analogous to slash commands in IDEs like Zed, where a short command expands into a longer, predefined prompt that is sent to the LLM [13:14:00].
  • Prompts can standardize formatting rules or data presentation for common tasks like document Q&A across different teams [13:51:00].
  • Prompts can be dynamic, interpolated with context from the user or application [20:58:00], as in the sketch after this list.
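
A sketch of a prompt template using FastMCP’s prompt decorator; the formatting rules shown are invented for illustration:

```python
# Sketch of a user-controlled prompt template on a FastMCP server,
# analogous to a slash command. Names and rules are illustrative.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("team-prompts")

@mcp.prompt()
def document_qa(document: str) -> str:
    """Standardized template for document Q&A, interpolated with context."""
    return (
        "Answer questions about the following document. "
        "Cite the section you relied on in every answer.\n\n"
        f"{document}"
    )
```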

Value Proposition of MCP

The adoption of MCP offers several advantages:

  • For Application Developers: Once a client is MCP-compatible, it can connect to any server without additional work [05:42:00].
  • For Tool and API Providers: Building an MCP server once allows its adoption across various AI applications [05:51:00].
  • For End Users: Leads to more powerful and context-rich AI applications [06:28:00].
  • For Enterprises: Provides a clear way to separate concerns between teams. For instance, an infrastructure team that owns a vector database can turn it into an MCP server, allowing other teams to build AI applications faster without needing direct access or custom integration [06:48:00]. This concept is similar to microservices, where teams own specific services [07:49:00].

Adoption and Real-World Examples

MCP has seen significant traction since its launch [08:05:00].

  • Applications and IDEs: MCP gives users coding in IDEs a way to supply context, letting agents within the IDE interact with external systems like GitHub or documentation sites [08:21:00]. Examples include Cursor and Windsurf [04:31:00].
  • Community Servers: Over 1,100 community-built MCP servers have been published open source [08:47:00]. Companies like Block have launched MCP clients (e.g., Goose) [04:36:00].
  • Official Integrations: Companies like Cloudflare and Stripe have built official MCP integrations [50:30:00].
  • Demo with Claude for Desktop:
    • Claude for Desktop (an MCP client) can be given a GitHub repo URL and tasked with triaging issues [23:13:00].
    • The model (Claude) automatically invokes the “list issues” tool to pull issues into context and summarize them [23:29:00].
    • It intelligently prioritizes issues based on user context [23:43:00].
    • Claude can then triage issues and add them to an Asana project, using tools like “list workspaces” and “search projects” from an Asana MCP server [24:15:00].
    • Many of these servers are community-built and require only a few hundred lines of code [24:52:00].
    • Claude for Desktop acts as a central dashboard for bringing in context from various external systems [25:23:00].

MCP’s Role in Augmented LLM Systems

MCP is envisioned as a foundational protocol for agents broadly [26:36:00]. It supports the concept of an “augmented LLM”: an LLM that interacts with retrieval systems, tools, and memory [27:20:00].

Key Capabilities for Agents

  • Federation: MCP can federate and simplify how LLMs interact with retrieval systems, invoke tools, and manage memory in a standardized way [28:10:00].
  • Expandability: Agents can expand their capabilities even after initial programming by discovering new tools and interactions with the world [28:41:00].
  • Agentic Loop: At its core, an agent system is an augmented LLM running in a loop: it works on a task, invokes tools, and responds to results until the task is complete [29:03:00]. MCP provides these capabilities openly, letting agent builders focus on the core loop and context management rather than on server capabilities or data provision [29:18:18]. A schematic of the loop follows this list.
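
A schematic of that loop; `call_llm` and `invoke_tool` are hypothetical stand-ins for a model API and an MCP client session, not names from the talk or any SDK:

```python
# Schematic of the agentic loop: an augmented LLM runs until the task is
# done, invoking tools along the way. call_llm and invoke_tool are
# hypothetical stand-ins for a model API and an MCP client session.
def run_agent(task, tools, call_llm, invoke_tool):
    context = [{"role": "user", "content": task}]
    while True:
        reply = call_llm(context, tools)      # the model decides what to do next
        if not reply.tool_calls:              # no tool call means the task is done
            return reply.content
        for call in reply.tool_calls:         # execute each requested tool
            result = invoke_tool(call.name, call.arguments)
            # feed the result back so the model can react to it
            context.append({"role": "tool", "name": call.name, "content": result})
```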

mcp-agent Framework Example

The open-source mcp-agent framework, built by LastMile AI, showcases how agents can interact with MCP [30:21:00].

  • A multi-agent system can be defined in a simple Python file [30:41:00].
  • Example: A research agent tasked with researching quantum computing’s impact on cybersecurity, using MCP servers for Brave (web search), Fetch (data retrieval), and Filesystem (file management) [31:17:00].
  • Other agents, like a fact-checker agent and a research-report-writer agent, can use similar or different MCP servers to complete their sub-tasks [32:05:00].
  • The agent first forms a plan (a series of steps) based on its task and available servers [33:02:00].
  • MCP serves as an abstraction layer, allowing the agent builder to focus on the task and agent interactions rather than the underlying server details [33:48:00]. A sketch of such a definition follows this list.
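
A sketch of how such an agent definition can look, modeled on mcp-agent’s README; module paths and class names may vary between versions, so treat the details as illustrative:

```python
# Illustrative research agent defined with mcp-agent; exact module paths
# and class names may differ between framework versions.
import asyncio

from mcp_agent.agents.agent import Agent
from mcp_agent.app import MCPApp
from mcp_agent.workflows.llm.augmented_llm_anthropic import AnthropicAugmentedLLM

app = MCPApp(name="research")

async def main():
    async with app.run():
        researcher = Agent(
            name="researcher",
            instruction="Research quantum computing's impact on cybersecurity.",
            # each name maps to an MCP server configured for the app
            server_names=["brave", "fetch", "filesystem"],
        )
        async with researcher:
            llm = await researcher.attach_llm(AnthropicAugmentedLLM)
            print(await llm.generate_str("Summarize your findings as a report."))

asyncio.run(main())
```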

Sampling

Sampling allows an MCP server to request LLM inference calls (completions) from the client, rather than implementing its own LLM interaction [53:52:00].

  • This is useful when a server needs intelligence (e.g., to ask for more user input or formulate questions) [54:46:00].
  • The client retains full control over the LLM interaction, including hosting, model choice, privacy, and cost parameters [54:55:00].
  • Servers can specify preferences for model size or version [55:12:00].
  • This design principle accounts for scenarios where clients have no prior knowledge of a server [55:52:00]. A server-side sketch follows this list.
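
A server-side sketch based on the MCP Python SDK’s sampling support; the tool and prompt text are illustrative:

```python
# Sketch of sampling from the server side: inside a tool handler, the
# server asks the *client* to run an LLM completion rather than calling
# a model itself. Details are illustrative.
from mcp.server.fastmcp import Context, FastMCP
from mcp.types import SamplingMessage, TextContent

mcp = FastMCP("summarizer")

@mcp.tool()
async def summarize(text: str, ctx: Context) -> str:
    """Ask the client-side LLM to summarize text on the server's behalf."""
    result = await ctx.session.create_message(
        messages=[
            SamplingMessage(
                role="user",
                content=TextContent(type="text", text=f"Summarize:\n{text}"),
            )
        ],
        max_tokens=300,  # the client still controls model choice, privacy, cost
    )
    return result.content.text if result.content.type == "text" else ""
```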

Composability

Composability means that any application, API, or agent can function as both an MCP client and an MCP server [56:21:00].

  • This allows for chaining interactions and building complex, multi-layered architectures of LLM systems, where each layer specializes in a particular task [57:17:00].
  • For example, a user interacts with Claude for Desktop (a client), which calls a research agent (both an MCP server and an MCP client), which in turn calls other servers like the file system, fetch, or web search servers [56:40:00].
  • This enables hierarchical systems of agents where each node can have autonomy and intelligence, making decisions on how to pull in richer data [01:01:14].
  • The combination of sampling and composability allows for hierarchical agent systems in which sampling requests are federated through the layers back to the application controlling the LLM interaction [01:11:41]. A sketch of a node acting as both client and server follows this list.
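
A sketch of one node acting as both server and client, assuming the Python SDK’s stdio client; the downstream command and tool names are illustrative:

```python
# Sketch of composability: this process is an MCP *server* to whoever
# launches it, and an MCP *client* of a downstream server. The command,
# file name, and tool names are illustrative.
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("research-agent")

# downstream server this node talks to as a client
downstream = StdioServerParameters(command="python", args=["web_search_server.py"])

@mcp.tool()
async def research(topic: str) -> str:
    """Expose a tool upward while delegating the work downstream."""
    async with stdio_client(downstream) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("search", {"query": topic})
            return str(result.content)

if __name__ == "__main__":
    mcp.run()
```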

Future and Roadmap

Remote Servers and Authentication

Support for OAuth 2.0 has been added to MCP [01:13:51].

  • This enables remotely hosted servers accessible via a public URL [01:15:09].
  • The server orchestrates the OAuth handshake, handles the callback URL, and stores the OAuth token, providing the client with a session token for future interactions [01:14:22].
  • Remote servers can operate over SSE (Server-Sent Events) instead of standard I/O (stdio), removing development friction and making servers discoverable like websites [01:14:04].
  • This significantly boosts the number of available servers, since users don’t need to build or host them locally [01:15:37]. A client-side connection sketch follows this list.
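
A sketch of a client connecting over SSE with the Python SDK; the URL is illustrative, and a real deployment would also present the session token obtained from the OAuth flow:

```python
# Sketch of connecting to a remotely hosted MCP server over SSE rather
# than spawning a local process. The URL is illustrative.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    async with sse_client("https://example.com/mcp/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # discover capabilities remotely
            print([tool.name for tool in tools.tools])

asyncio.run(main())
```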

Registry and Discoverability

Currently, there’s no centralized way to discover and integrate MCP servers [01:22:00].

  • An official MCP registry API is under development [01:22:30].
  • This will be a unified, hosted metadata service, built in the open, addressing questions like protocol discovery, server trust, and official verification [01:22:52].
  • The registry will also include versioning to track changes in APIs, tools, or tool descriptions [01:24:01].
  • Self-Evolving Agents: An MCP server registry allows agents to become self-evolving by dynamically discovering new capabilities and data on the fly [01:35:59]. An agent can search the registry for a verified server (e.g., Grafana), install or invoke it remotely, and then perform tasks like querying logs or fixing bugs, even if it wasn’t pre-programmed to do so [01:36:32]. This means agents can find and choose their own tools, making the augmented LLM system more powerful [01:37:06].
  • Well-Known MCP: A complementary approach to the registry is a “well-known” MCP JSON file (e.g., shopify.com/.well-known/mcp.json) [01:39:24]. This provides a verified, top-down way for agents to discover and interact with specific services through their predefined APIs, complementing UI-based computer-use models for long-tail interactions [01:40:07]. A hypothetical example follows this list.
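
No schema for such a file had been finalized at the time of the talk, so the following shape is purely hypothetical, sketching only the kinds of fields the idea implies:

```json
{
  "name": "shopify",
  "description": "Interact with a Shopify store's products and orders",
  "endpoint": "https://shopify.com/api/mcp",
  "authentication": { "type": "oauth2" },
  "capabilities": ["tools", "resources", "prompts"]
}
```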

Medium-Term Considerations

  • Stateful vs. Stateless Connections: Exploring short-lived connections for basic capabilities (client asking server) and maintaining long-lived connections for advanced features like sampling or server-initiated notifications [01:41:41].
  • Streaming: How to support streaming data and multiple chunks of data arriving at the client from the server over time [01:42:41].
  • Namespacing: Addressing conflicts when multiple servers have tools with the same name, potentially through logical grouping of tools within the protocol [01:42:51].
  • Proactive Server Behavior: Developing patterns for event-driven or deterministic server behavior where the server initiates contact with the user or client to ask for information or provide notifications [01:43:31]. This includes scenarios where a server initiates sampling from scratch, perhaps triggered by an external API request [01:30:27].

Ultimately, Anthropic aims for its products to be the best MCP clients, but not the only ones, fostering competition and an open ecosystem that benefits users and developers [01:28:15].