From: aidotengineer
Tool calling is not merely “plumbing” for AI agents; it is more significant than some people realize [00:00:04]. While much attention is often given to improving agents, less focus has historically been placed on building reusable and robust tools for them [00:00:53]. However, this is changing, with more tool platforms and libraries emerging [00:01:06]. The agent’s effectiveness is often directly tied to the quality of its tools [00:02:56].
Understanding Tool Calling
At its core, tool calling allows an agent to interact with external functionalities or data sources [00:05:37]. For example, when an agent is asked a complex question like “How far does the moon orbit, expressed as the number of trips between Amsterdam and San Francisco?” [00:04:12], it might perform a series of tool calls. This could involve searching for distances, using a calculator, or querying a database [00:04:50]. These tools often connect to external APIs, web searches, or geographical databases [00:05:37].
A typical tool definition includes the following (a minimal sketch in code follows this list):
- Tool Name: Kept simple for clarity [00:06:06].
- Tool Description: Functions almost like a system prompt for the Large Language Model (LLM), guiding it on how to use the tool [00:06:09].
- Input Parameters: Specifies what the model needs to provide to call the tool [00:06:40].
- Output Schema: Increasingly important for type safety, structured outputs, and chaining tool calls [00:06:48].
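To make this concrete, here is a minimal sketch of such a definition in the JSON-schema style used by OpenAI-compatible function-calling APIs; the get_distance tool, its parameters, and the separate output schema are illustrative assumptions, not taken from the talk.

```python
# A tool definition in the JSON-schema style most LLM providers accept
# (shown here in OpenAI's function-calling format). The name is kept
# short, and the description reads like a mini system prompt telling
# the model when to use the tool.
get_distance_tool = {
    "type": "function",
    "function": {
        "name": "get_distance",
        "description": (
            "Returns the distance in kilometers between two cities. "
            "Use this whenever the user asks how far apart two places are."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "from_city": {"type": "string", "description": "Origin city"},
                "to_city": {"type": "string", "description": "Destination city"},
            },
            "required": ["from_city", "to_city"],
        },
    },
}

# A hypothetical output schema, useful for type safety and for chaining
# this tool's result into another tool call.
get_distance_output = {
    "type": "object",
    "properties": {"distance_km": {"type": "number"}},
}
```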
Traditional Tool Calling
Traditional tool calling, as seen two to three years ago, involves explicit management of the tool calling logic by the application where the agent resides [00:11:11].
Process
In this model, the client application sends a question to the agent/AI application [00:09:23], which wraps it in a prompt and sends it to the model [00:09:29]. When the model suggests a tool call, the application must explicitly look for tool call messages, parse them, and execute them via callback functions [00:10:47]. Queuing tool calls, retries, and error handling all have to be implemented in the application logic [00:11:01]. The result is a lot of back and forth between the application and the LLM, as well as between the client app and the agent [00:10:08].
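As a sketch of this explicit loop, assuming the OpenAI Python SDK and an illustrative get_distance stub (neither is prescribed by the talk), the application code ends up looking roughly like this:

```python
import json
from openai import OpenAI

client = OpenAI()

# One tool definition, in the same JSON-schema style as above.
tools = [{
    "type": "function",
    "function": {
        "name": "get_distance",
        "description": "Returns the distance in kilometers between two cities.",
        "parameters": {
            "type": "object",
            "properties": {
                "from_city": {"type": "string"},
                "to_city": {"type": "string"},
            },
            "required": ["from_city", "to_city"],
        },
    },
}]

def get_distance(from_city: str, to_city: str) -> str:
    # Stub implementation; a real tool would query a geo API.
    return json.dumps({"distance_km": 8840})

messages = [{"role": "user",
             "content": "How far apart are Amsterdam and San Francisco?"}]

# Round trip 1: the model may reply with tool calls instead of an answer.
response = client.chat.completions.create(
    model="gpt-4o", messages=messages, tools=tools)
msg = response.choices[0].message

# The application itself must detect, parse, and execute tool calls,
# then send the results back in round trip 2. Queuing, retries, and
# error handling all live in this loop.
if msg.tool_calls:
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_distance(**args),
        })
    response = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=tools)

print(response.choices[0].message.content)
```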
Characteristics
- Explicit Control: Developers have direct control over the tool calling process, including how tools are executed and how errors are handled [00:11:01].
- Complex Implementation: Requires developers to write specific logic for filtering tool call messages, parsing data, and managing execution flow [00:10:47].
- Separation of Concerns (Partial): While logic for tool calls is defined in the server app, it’s still tightly coupled with the agent’s implementation [00:09:48].
Embedded/Meta Tool Calling
Embedded tool calling, also referred to as meta tool calling, is a more recent approach where the tool calling logic is encapsulated within a closed system, often the agent framework itself [00:12:24].
Process
In this model, the client application simply asks a question [00:11:36]. The agent, acting as a black box, handles connecting with the LLM, calling the tools, and performing all necessary logic internally [00:12:11]. The tools are defined within the same agent framework, and the final answer is returned directly to the client [00:12:17]. For example, in LangChain, this is achieved with functions like create_react_agent, which takes tools, a model, and a prompt, and handles the execution internally [00:12:46].
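A minimal sketch using LangGraph's prebuilt helper is below; the get_distance stub is illustrative, and exact parameter names (e.g., prompt) have shifted across LangChain/LangGraph versions, so treat this as an approximation rather than the talk's exact code.

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def get_distance(from_city: str, to_city: str) -> float:
    """Return the distance in kilometers between two cities."""
    return 8840.0  # stub; a real tool would query a geo API

# Tools, model, and prompt go in; the framework owns the tool-calling
# loop, including parsing tool calls and feeding results back.
agent = create_react_agent(
    ChatOpenAI(model="gpt-4o"),
    tools=[get_distance],
    prompt="You are a helpful travel assistant.",
)

# One call in, final answer out; everything in between is a black box.
result = agent.invoke(
    {"messages": [("user", "How far is Amsterdam from San Francisco?")]}
)
print(result["messages"][-1].content)
```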
Characteristics
- Ease of Implementation: Simpler for developers, as they don’t need to worry about errors, retries, or explicit tool calling logic [00:13:13].
- Lack of Control: Developers have no control over the internal tool calling process, how decisions are made, or the format of outputs beyond the initial tool definition [00:13:19].
- Black Box System: The agent acts as a self-contained unit, abstracting away the complexities of tool interaction [00:12:24].
Evolution and Future Directions
While embedded tool calling offers simplicity, there’s a strong argument for greater separation of concerns in building AI applications [00:13:41].
Model Context Protocol (MCP)
Introduced by Anthropic, the Model Context Protocol (MCP) aims to separate the client-side (host) from the server-side in agentic applications [00:14:53]. The server, which has access to tools and assets, is the only component the host and client directly interact with [00:15:20]. This provides a clearer distinction between the front-end and back-end logic [00:15:43].
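For a sense of what the server side looks like, here is a minimal sketch using the FastMCP helper from the official MCP Python SDK (the server name and get_distance tool are illustrative, and SDK details may vary by version):

```python
from mcp.server.fastmcp import FastMCP

# The MCP server owns the tools; the host and client only ever talk
# to this server, never to the tools directly.
mcp = FastMCP("distance-tools")

@mcp.tool()
def get_distance(from_city: str, to_city: str) -> float:
    """Return the distance in kilometers between two cities."""
    return 8840.0  # stub; a real tool would query a geo API

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio by default
```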
Standalone Tool Platforms
A growing trend is the development of standalone tool platforms [00:16:25]. These platforms allow tools to be defined and hosted separately from the agent framework [00:16:39]. The agent framework then imports these tools via an SDK or API call [00:16:48]. Tool creation and execution are handled remotely, offering greater flexibility and a clear separation of concerns [00:16:54]. Examples include Composio, Toolhouse, Arcade AI, and Wild Card [00:19:36].
Advantages of standalone tool platforms:
- Modularity: Tools can be created once and used across different agentic frameworks like LangChain, CrewAI, or Autogen [00:19:11].
- Centralized Management: Platforms can handle complex logic like chaining tools, authorization, and error handling [00:18:21].
- Flexibility: Allows for easy switching between agent frameworks without rebuilding tools [00:19:20].
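The import-and-go pattern looks roughly like the sketch below. Note that toolplatform, ToolClient, and get_tools are invented names for illustration only; real platforms such as Composio or Arcade AI expose similar SDKs with their own naming.

```python
from toolplatform import ToolClient  # hypothetical SDK, not a real package

client = ToolClient(api_key="...")

# The tool is defined, hosted, and executed on the platform; the agent
# just imports a framework-compatible wrapper for it. Authorization,
# chaining, and error handling stay on the platform side.
tools = client.get_tools(["get_distance"], framework="langchain")

# The same remotely hosted tool could be fetched for CrewAI or Autogen
# instead; here it is handed to a LangGraph agent as before.
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

agent = create_react_agent(ChatOpenAI(model="gpt-4o"), tools)
```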
Dynamic Tools
Instead of creating a multitude of static tools for every specific function (e.g., separate tools for “get customers,” “get customers by country,” “get orders”), dynamic tools offer a more generalized approach [00:22:09]. For example, a single tool can be created that connects to a GraphQL or SQL schema [00:22:38]. The LLM then generates the appropriate query (e.g., GraphQL query) based on the schema and the user’s request [00:22:57].
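As a sketch of this pattern under assumed table names (customers, orders), a single LangChain-style tool can expose a whole SQLite database and let the LLM write the SELECT statement itself:

```python
import sqlite3
from langchain_core.tools import tool

# The schema in the docstring is what the LLM reads to write queries;
# the table and column names here are illustrative.
@tool
def query_database(sql: str) -> str:
    """Run a read-only SQL query against this schema:
    customers(id, name, country); orders(id, customer_id, total).
    Only SELECT statements are allowed."""
    if not sql.lstrip().lower().startswith("select"):
        return "Error: only SELECT statements are allowed."
    conn = sqlite3.connect("app.db")
    try:
        return str(conn.execute(sql).fetchall())
    except sqlite3.Error as exc:
        # Return the error to the model so it can repair an invalid
        # query, since generated SQL is not always correct (see the
        # considerations below).
        return f"SQL error: {exc}"
    finally:
        conn.close()
```

One dynamic tool like this replaces the whole family of static “get customers”, “get customers by country”, and “get orders” tools.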
Considerations
While dynamic tools reduce the number of explicit tool definitions and let existing business logic be leveraged [00:24:29], they come with trade-offs. LLMs can hallucinate schema elements or generate invalid query syntax, so generated queries require careful handling [00:24:37]. Even so, there is strong potential for building the future of tool calling around dynamic tools rather than static ones [00:24:48].