From: aidotengineer

The field of Artificial Intelligence (AI) is undergoing a significant revolution, moving beyond hype to real-world applications and widespread adoption [00:17:24]. This shift is powered by technologies that enable AI models to interact with the outside world, perform complex tasks, and integrate seamlessly into existing workflows. Key to this evolution are concepts like tool calling and the development of robust AI application frameworks.

The Real Revolution of AI

Unlike past technologies like blockchain or NFTs, AI is demonstrating tangible impact through widespread user adoption and significant revenue generation [00:17:48].

  • ChatGPT achieved 100 million users faster than any consumer product in tech history, with millions using it daily to accomplish tasks [00:18:04].
  • GitHub Copilot boasts millions of subscribers and is integrated into Microsoft 365, reaching 84 million everyday consumers [00:18:35].
  • Azure AI is being adopted by enterprises, generating an annual revenue of $13 billion [00:18:43].
  • The AI Engineer World’s Fair in 2025 hosted over 3,000 attendees, nearly twice as many as the previous year, all actively building and using AI technologies [00:21:11].

Evolution of AI Engineering

The field of AI engineering has matured rapidly. In 2023, discussions centered on the three types of AI engineers, evolving to a focus on multi-disciplinary approaches in 2024 with the introduction of multiple conference tracks [00:25:38]. By 2025, the focus shifted to agent engineering [00:26:01]. Initially, AI engineering and “GPT wrappers” were derided, but now engineers in this space are thriving [00:26:09].

A consistent lesson is the importance of not overcomplicating solutions [00:26:32]. The field is still early, with significant alpha left to mine [00:27:00].

Standard Models in AI Engineering

The industry is seeking a “standard model” for AI engineering, similar to established concepts like ETL, MVC, CRUD, or MapReduce in traditional software engineering [00:28:07].

Candidates for Standard Models:

  • The LLM OS: Karpathy’s “LLM as operating system” framing, first proposed in 2023, has been updated for 2025 to include multimodality, standard toolsets, and MCP as a default external communication protocol [00:28:54].
  • LLM SDLC (Software Development Life Cycle): Early parts of the LLM SDLC, such as LLMs themselves, monitoring, and RAG (Retrieval-Augmented Generation), are becoming commoditized or available through free tiers [00:29:42]. Real value is generated when focusing on evaluations, security, and orchestration, and doing the “hard engineering work” [00:29:58]. This pushes AI engineering from demos to production [00:30:11].
  • Building Effective Agents: This is becoming the received wisdom for how to build an agent, although definitions continue to iterate [00:30:27]. Instead of arguing about definitions, the focus should be on the ratio of human input to valuable AI output [00:32:10]. This includes understanding intent, control flow, memory planning, and tool use [00:31:06].
  • S.P.A.D.E. (Sync, Plan, Analyze, Deliver, Evaluate): A generalized model for building AI-intensive applications involving thousands of AI calls [00:33:47]. This includes processing data into knowledge graphs, generating structured outputs, and creating code artifacts [00:34:21].
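The agent loop behind “Building Effective Agents” can be sketched in a few lines. This is a toy illustration, not any SDK’s API: the model is stubbed, and `run_agent`, `fake_model`, and the `search` tool are made-up names. It shows the four ingredients the talk lists: intent (the goal), control flow (the loop), memory (accumulated tool results), and tool use.

```python
def fake_model(goal, memory):
    """Stand-in for an LLM: picks the next action from the context so far."""
    if not memory:
        return {"tool": "search", "args": {"query": goal}}
    return {"tool": None, "answer": f"Summary of {len(memory)} tool result(s)"}

# Toy tool registry; a real agent would expose these via tool calling.
TOOLS = {
    "search": lambda query: f"top results for {query!r}",
}

def run_agent(goal, max_steps=5):
    memory = []  # working memory: every tool call and its result
    for _ in range(max_steps):
        action = fake_model(goal, memory)
        if action["tool"] is None:  # model decided it has enough context
            return action["answer"]
        result = TOOLS[action["tool"]](**action["args"])
        memory.append((action["tool"], result))
    return "gave up"
```

The ratio the talk cares about falls out of this structure: one line of human input (the goal) drives an arbitrary number of model-chosen tool calls before the valuable output comes back.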

The Agentic Web and Tooling

The emergence of diverse reasoning models and their increased efficiency are giving rise to the “agentic web” [00:37:37]. This is a world where agents interact with tools, models, and other agents across different clouds, companies, and devices [00:38:04].

Key forces in AI engineering within the agentic web include:

  • Transitioning from pair programming to peer programming, with tools like Copilot becoming active teammates [00:38:24].
  • Moving from a software factory to an agent factory, focusing on shipping behaviors rather than just binaries [00:38:33].
  • Models increasingly residing on devices, enabling local execution and low latency [00:38:42].

Platforms and Tools for Agent Development

Microsoft’s core AI platform aims to empower users to shape the world with AI [00:42:42].

  • Foundry: This platform enables building agents that can retrain, redeploy, and change post-deployment [00:46:46]. It supports an “agent factory” with baked-in trust and security, and seamless cloud-to-edge operation [00:38:53].
  • Signals Loop: A continuous loop emerging in agent development, where fine-tuning models to personalize outcomes leads to better results, observed in applications like Dragon, a healthcare Copilot [00:45:52].
  • Intelligent Routing: Foundry offers a switchboard for intelligent routing, providing access to 10,000 open and proprietary models, backed by security and data residency [00:47:12].
  • Agentic RAG: An improvement over traditional RAG, offering multi-shot iteration and evaluation, leading to a 40% improvement in accuracy on complex queries [00:47:37].
  • Tooling as Infrastructure: Tooling is becoming infrastructure, requiring code and containers. Platforms like Foundry provide over 1500 tools and were early adopters of MCP and A2A (Agent-to-Agent) protocols [00:47:53].
  • Accountability: Aggressive efforts are being made in evaluation SDKs and red teaming agents, with continuous observability through integration with OpenTelemetry [00:48:07].
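The “agentic RAG” bullet above describes multi-shot iteration with evaluation. A minimal sketch of that loop, with a toy corpus, evaluator, and query-rewrite rule standing in for real retrieval and Foundry services:

```python
# Toy corpus; in practice this would be a vector or search index.
CORPUS = {
    "billing": "Invoices are issued monthly.",
    "billing refunds": "Refunds are processed within 5 business days.",
}

def retrieve(query):
    return CORPUS.get(query, "")

def covers(evidence, required_term):
    """Toy evaluator: does the evidence actually address the need?"""
    return required_term in evidence.lower()

def agentic_rag(query, required_term, max_rounds=3):
    for round_no in range(1, max_rounds + 1):
        evidence = retrieve(query)
        if covers(evidence, required_term):  # evaluation step
            return evidence, round_no
        query = query + " refunds"           # naive query refinement
    return "", max_rounds
```

Traditional RAG stops after the first `retrieve`; the claimed accuracy gains come from the evaluate-and-refine rounds that follow it.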

Over 50,000 agents are built daily using these platforms [00:48:29].

Examples of AI-Powered Development Tools:

  • GitHub Copilot spaces: Allows users to create a Copilot space grounded in specific files, enabling it to answer questions about a project’s facts [00:42:01].
  • Copilot for task automation: Can be assigned tasks, such as writing a README file, and can run tests until complete [00:42:26].
  • Extending Copilot with other agents: GitHub Copilot within VS Code can communicate with other agents, like “Amaly MLE” (machine learning engineer agent), to reason about code and write new code [00:43:43].

Focus on MCP

The Model Context Protocol (MCP) is highlighted as a foundational shift in the internet’s economy, where tool calls are becoming the new clicks [02:31:05].

Origin and Evolution of MCP

The genesis of MCP stemmed from the need to address “copy and paste hell,” where AI was disconnected from the rest of the world [02:30:00]. MCP’s co-creators at Anthropic conceived of it after observing that engineers were constantly copying context from external sources (like Slack or error logs) into LLM context windows [02:33:20].

The core idea was to enable models to “climb out of their box” and interact with the outside world, giving them model agency [02:33:48]. This led to the conclusion that an open-source, standardized protocol was necessary for scalability [02:34:21]. MCP was open-sourced in November 2024 [02:36:08].

Early on, many questioned the need for a new open-source protocol [02:36:40]. A turning point was when Cursor and other coding tools adopted MCP, giving builders hands-on experience and demonstrating its utility [02:37:18]. More recently, major players like Google, Microsoft, and OpenAI have also adopted MCP, cementing its status as a standard [02:37:50].

Key Principles and Features of MCP

  • Model Agency: The protocol is designed to give models the intelligence to choose actions and decide what to do, similar to how humans delegate tasks [02:39:14].
  • Server Simplicity: Optimizing for server simplicity means that when trade-offs are necessary between client and server complexity, the complexity is pushed to the client, as it is believed there will be significantly more servers than clients [02:40:23].
  • Streamable HTTP: Recent support for streamable HTTP enables more bidirectionality, crucial for agents to communicate with each other [02:39:57].
  • OAuth Fixes: Initial issues with OAuth implementation were resolved through community contributions, demonstrating the value of open-source collaboration [02:41:23].
  • Elicitation: Allows servers to request more information from end-users, enabling better-tailored responses (e.g., clarifying “best flight” means cheapest or fastest) [02:42:30].
  • Registry API: Aims to make it easier for models to discover MCPs that were not given to them upfront, further enhancing model agency [02:43:08].
  • Debugging Tools: Inspector is highlighted as an underutilized debugging tool for MCP servers [02:42:14].
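Concretely, MCP runs over JSON-RPC 2.0: a tool invocation is a `tools/call` request naming the tool and its arguments, and the server replies with a list of content blocks. The message shapes below follow the spec; the `search_flights` tool and its fields are a made-up example.

```python
import json

# Client -> server: invoke one tool by name with structured arguments.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_flights",                      # hypothetical tool
        "arguments": {"origin": "SFO", "dest": "JFK"},
    },
}

# Server -> client: result is a list of typed content blocks.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "3 flights found"}]
    },
}

wire = json.dumps(request)  # what actually crosses the transport
```

Features like elicitation and the registry are additional methods layered on this same request/response shape.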

Building with MCP

Theo from Anthropic suggests key areas for building in the MCP ecosystem:

  1. More and Higher-Quality Servers: A significant opportunity exists to build more servers, particularly for verticals beyond developer tools like sales, finance, legal, and education [02:45:17]. Servers should be designed with three users in mind: the end-user, the client developer, and the model itself, ensuring tools exposed to the model enable correct responses [02:45:58].
  2. Simplifying Server Building: Tooling that makes it easier for enterprises and indie hackers to build servers is crucial, including hosting, testing, evaluation, and deployment tools [02:46:52].
  3. Automated MCP Server Generation: A moonshot for the future, where models become so adept at writing code that they can generate their own MCPs on the fly [02:47:44].
  4. AI Security, Observability, and Auditing: As AI applications gain access to real-world data, security and privacy implications increase, making tooling in this area essential [02:48:23].
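Point 2 above, simplifying server building, is largely about making tool registration trivial. The sketch below shows the pattern in stdlib Python: a decorator registers a plain function as a tool and derives its parameter list from the signature. This is similar in spirit to, but not an implementation of, SDK helpers like FastMCP’s tool decorator; all names here are illustrative.

```python
import inspect

REGISTRY = {}

def tool(fn):
    """Register fn as a tool, recording its parameters for the tool listing."""
    REGISTRY[fn.__name__] = {
        "fn": fn,
        "params": list(inspect.signature(fn).parameters),
    }
    return fn

@tool
def lookup_invoice(invoice_id: str) -> str:
    """Return a one-line summary of an invoice (toy data)."""
    return f"invoice {invoice_id}: paid"

def list_tools():
    # What a server would expose to the "tools/list" request.
    return {name: meta["params"] for name, meta in REGISTRY.items()}

def call_tool(name, **kwargs):
    # What a server would run on "tools/call".
    return REGISTRY[name]["fn"](**kwargs)
```

The point is that a vertical (sales, finance, legal) server author should only have to write the decorated functions; listing, schemas, and dispatch come from the framework.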

Implementation of Tool Calling and MCP at Scale

Anthropic, through John, shared experiences in implementing MCP clients and talking to remote MCP servers at scale [02:50:10].

  • Integration Chaos: The rapid proliferation of custom endpoints for every use case led to duplication of functionality and inconsistent integrations [02:51:25].
  • Standardization: Standardizing on MCP internally helps streamline processes, allows engineers to focus on unique problems, and provides solutions for unforeseen issues like billing models and token limits [02:54:11].
  • MCP Gateway: Anthropic designed a shared infrastructure service, the MCP gateway, as a single point of entry. It handles credential management, centralized rate limiting, and observability, making it easy for engineers to connect to MCP services [02:58:07].
  • Portable Credentials: The gateway enables portable credentials, so batch jobs or internal services don’t require re-authentication [03:01:37].
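The gateway pattern described above can be sketched as a single entry point that injects centrally stored credentials and enforces a per-service rate limit, so individual callers never handle tokens themselves. Service names, limits, and the in-memory stores are toy stand-ins for Anthropic’s internal infrastructure.

```python
import time
from collections import defaultdict, deque

CREDENTIALS = {"crm": "secret-token"}   # managed centrally, not by callers
RATE_LIMIT = 2                          # calls per service per window
WINDOW_SECONDS = 60

_calls = defaultdict(deque)             # sliding window of call timestamps

def gateway_call(service, payload):
    now = time.monotonic()
    window = _calls[service]
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()                # drop calls outside the window
    if len(window) >= RATE_LIMIT:
        raise RuntimeError(f"rate limit exceeded for {service}")
    window.append(now)
    token = CREDENTIALS[service]        # credential injection
    return {"service": service, "auth": token, "payload": payload}
```

Because the token lookup happens inside the gateway, a batch job can pass just a service name and payload, which is what makes the credentials portable.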

Hidden Capabilities and Challenges of MCP

Harold from VS Code discussed the lesser-known aspects of MCP, aiming to unlock its full potential for rich, stateful interactions [03:04:04].

  • “MCP is just another API wrapper” syndrome: Many implementations only leverage tools, neglecting other powerful features like dynamic discovery, persistent resources, and rich interactions [03:05:39].
  • Tool Quality over Quantity: Too many tools or tools across too many domains can confuse AI models [03:07:45]. Client-side controls like per-chat tool selection and user-defined tool sets can help manage this [03:08:21].
  • Dynamic Tool Discovery: Allows servers to dynamically provide tools based on the current context, such as a “battle” tool appearing only when a monster is present in a game [03:09:41].
  • Resources: Enable the server to return references to files or data rather than the full content, allowing the LLM or user to follow up [03:10:26]. This can also customize responses based on the user’s environment (e.g., React vs. Svelte setup) [03:11:06].
  • Sampling: Allows the server to request LLM completions from the client, enabling features like summarizing resources or formatting web content [03:11:51].
  • Developer Experience: Tools like VS Code’s dev mode provide direct debugging and logging for MCP servers, improving the development process [03:13:15].
  • Community Registry: Efforts are underway to create a community registry for MCP servers to ease discovery and collaboration [03:15:32].
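The dynamic tool discovery bullet above can be sketched directly: the server recomputes its tool list from the current context and signals the client when it changes (MCP has a list-changed notification for this; here it is reduced to a boolean). The game scenario mirrors the “battle tool” example, and all function names are illustrative.

```python
def available_tools(context):
    """Expose only the tools that make sense in the current context."""
    tools = ["move", "look"]
    if context.get("monster_present"):
        tools.append("battle")  # only offered when it can actually succeed
    return tools

def on_context_change(old_tools, context):
    tools = available_tools(context)
    list_changed = tools != old_tools  # would trigger a list-changed notice
    return tools, list_changed
```

Pruning the list this way also serves the quality-over-quantity point: the model never has to reason about a tool that cannot apply right now.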

Practical Considerations for MCP Implementation

David from Sentry provided a “rant” on the practical challenges and learnings from building an MCP server for a B2B SaaS company [03:19:05].

  • Focus on Remote MCP: For cloud services, the remote MCP interface with OAuth 2.1 is key, rather than the stdio transport, due to security, iteration speed, and existing infrastructure advantages [03:27:46].
  • Beyond Open API: MCP is not a simple wrapper for existing APIs [03:26:04]. Developers must design tools specifically for AI models, massaging data and focusing on how an agent would use context [03:26:18]. Returning structured Markdown is often more effective than raw JSON [03:29:42].
  • Client Support: Early adoption faces challenges with inconsistent client support for MCP features like OAuth [03:26:36].
  • Security Risks: Allowing random MCP tools in an organization is dangerous due to prompt injection and data exfiltration risks [03:28:40]. Trust and vetting are essential.
  • Cost Management: The cost of tool calls can escalate rapidly with large token counts [03:31:52]. Thoughtful design to limit token usage is necessary.
  • Continuous Evolution: MCP implementations are not “set and forget”; continuous updates and tweaks are required [03:32:29].
  • Focus on Agents: The true value unlock is in building agents and exposing them through the MCP architecture [03:33:38]. Controlling the agent provides control over the prompt, tool results, and even the model itself [03:33:11].
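The “structured Markdown beats raw JSON” advice above is easy to see side by side: the same record, rendered once as the JSON an API returns and once as the Markdown an agent reads. The field names are made up for illustration, not Sentry’s schema.

```python
import json

issue = {"title": "NullPointerException", "count": 120, "last_seen": "2025-06-01"}

def as_raw_json(record):
    # What a thin API wrapper would hand the model: dense, unlabeled tokens.
    return json.dumps(record)

def as_markdown(record):
    # What a purpose-built tool hands the model: labeled, skimmable context.
    return (
        f"## {record['title']}\n"
        f"- events: {record['count']}\n"
        f"- last seen: {record['last_seen']}\n"
    )
```

The Markdown version costs a few more tokens here, but on real payloads it lets the tool author drop irrelevant fields and label the rest, which is the “massaging data for the agent” work the talk describes.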

Ultimately, while MCP and agent architecture introduce new terms, they represent familiar software engineering challenges and solutions, albeit with a new “coat of paint” [03:34:58]. The core opportunity lies in reimagining experiences and building “thick” workflow wrappers that enhance user value [01:16:44].