From: aidotengineer
This article discusses the importance of tool calling in AI agents, emphasizing that it is more than just “plumbing” [00:04]. The goal is to empower users to build their own tools, separate from agentic frameworks, to achieve greater flexibility [00:40].

The Overlooked Importance of Tools

While much attention is given to improving agents, less time has historically been spent on building reusable and robust tools that can be integrated into different frameworks [00:53]. However, this is changing, with more tool platforms and libraries emerging [01:06]. The general sentiment is shifting, with a growing recognition that the “pressure is sort of off the agents” as people now focus on improving the tools [01:12].

An agent’s effectiveness is directly tied to the quality of its tools [02:56]. Frameworks offer significant support for agents but less for tool development itself [03:02]. The application layer, specifically where tools are built, deserves more attention as large language models (LLMs) continue to advance [03:25].

Understanding Tool Calling

Tool calling allows an agent to answer complex questions by performing a series of actions, such as expressing the length of the Moon’s orbit in terms of trips between two cities [04:10]. While an LLM might sometimes generate an answer directly from its training data [04:38], it more commonly uses tool calls to:

  • Search for external information (e.g., the distance between the Earth and the Moon, or between two cities) via web searches or geographical databases [05:51].
  • Perform calculations using internal functions (e.g., JavaScript, Python) or external APIs (e.g., Wolfram Alpha) [05:44].
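
Once the search tools have returned the raw numbers, the calculation step itself is trivial; the real work is orchestrating the calls. A minimal sketch of that final step, using illustrative values only (the average Earth–Moon distance and an assumed 500 km city-to-city trip):

```python
import math

# Values the search tools might return (illustrative only)
moon_orbit_radius_km = 384_400  # average Earth-Moon distance
city_trip_km = 500              # assumed distance between the two cities

# Treat the Moon's orbit as roughly circular
orbit_length_km = 2 * math.pi * moon_orbit_radius_km

trips = orbit_length_km / city_trip_km
print(f"The Moon's orbit is roughly {trips:,.0f} city-to-city trips long.")
```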

Defining Tools

The definition of a tool significantly impacts its utility. Key components include:

  • Tool Name: Advised to keep simple [06:06].
  • Tool Description: Acts like a system prompt for the LLM, guiding its usage. Larger, more agentic tools often have detailed descriptions [06:09].
  • Input Parameters: Specifies what the model must provide to call the tool [06:40].
  • Output Schema: An increasingly important feature in agentic frameworks, allowing for type safety, structured outputs, and chaining of tool calls [06:46]. Without an output schema, the model relies on a plain string return, making complex sequences difficult to manage [07:06].
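
Concretely, most LLM APIs accept tool definitions in a JSON-Schema style. A minimal sketch of the four components above (the tool name, description, and fields are illustrative, and output schemas are newer and not yet supported uniformly across providers):

```python
# Illustrative tool definition in the JSON-Schema style most LLM APIs accept.
get_distance = {
    "name": "get_distance",  # tool name: short and descriptive
    "description": (         # acts like a system prompt for the LLM
        "Returns the distance in kilometers between two named locations. "
        "Use this whenever the user asks how far apart two places are."
    ),
    # Input parameters: what the model must supply to call the tool
    "parameters": {
        "type": "object",
        "properties": {
            "origin": {"type": "string", "description": "Starting location"},
            "destination": {"type": "string", "description": "Ending location"},
        },
        "required": ["origin", "destination"],
    },
    # Output schema: enables type checking and chaining of tool calls
    # (field name is illustrative; provider support varies)
    "output_schema": {
        "type": "object",
        "properties": {"distance_km": {"type": "number"}},
    },
}
```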

Evolution of Tool Calling within Agent Frameworks

The way tools are called within agent frameworks has evolved.

Traditional Tool Calling

This approach involves significant back-and-forth communication between the client application, the agent/AI application, and the large language model [09:02].

  • The client sends a prompt to the agent application [09:15].
  • The agent application constructs a prompt and sends it to the LLM [09:29].
  • The LLM recommends a tool call, and the application must explicitly parse and execute these tool call messages based on defined callback functions [09:44].
  • The tool response is sent back to the model, and this process repeats until an answer is formed [09:57].

This method, common in earlier agentic implementations, requires manual handling of tool queuing, retries, and errors [11:01].
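
In code, traditional tool calling is an explicit loop that the application owns. A minimal sketch, assuming a generic chat-completions-style client and a local registry of callback functions (all names here are hypothetical):

```python
import json

def call_llm(messages, tools):
    """Hypothetical wrapper around a chat-completions-style LLM API."""
    raise NotImplementedError  # stand-in for a real API call

# Callback registry: tool name -> local function
TOOL_CALLBACKS = {
    "get_distance": lambda origin, destination: {"distance_km": 4129},
}

def run_agent(prompt, tool_schemas):
    messages = [{"role": "user", "content": prompt}]
    while True:
        response = call_llm(messages, tool_schemas)
        # No tool calls recommended: the model has formed its answer
        if not response.get("tool_calls"):
            return response["content"]
        # Otherwise: parse each recommended tool call, execute the
        # matching callback, and send the result back to the model
        messages.append(response)
        for call in response["tool_calls"]:
            callback = TOOL_CALLBACKS[call["name"]]
            result = callback(**json.loads(call["arguments"]))
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
        # A production loop also needs the queuing, retries, and
        # error handling the talk mentions.
```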

Embedded Tool Calling (Black Box)

This is the prevailing approach in many modern frameworks today, where the system acts as a closed circuit [11:22].

  • The client application simply sends a question to the agent [11:34].
  • The agent, which is a closed system, handles all interactions with the LLM and the tools internally, performing the tool calls and returning the final answer [11:42].
  • The tool calling logic is entirely managed by the agent framework, acting as a black box [12:09].

While embedded tool calling is easier for beginners to implement, since it abstracts away error handling and retries [13:13], it offers limited control over the tool-calling process, how decisions are made, or the output format [13:19].
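
From the caller’s side, that trade-off is visible in how little code is needed. A sketch against a hypothetical framework API (no specific library implied):

```python
# Hypothetical framework API: the agent is a closed circuit that parses
# tool calls, executes them, and handles retries and errors internally.
from some_agent_framework import Agent  # hypothetical import

# "get_distance" is the tool definition from the earlier sketch
agent = Agent(model="some-llm", tools=[get_distance])
answer = agent.run("How many city-to-city trips long is the Moon's orbit?")

# Only the final answer comes back; the intermediate tool calls,
# retries, and errors are invisible to the client application.
print(answer)
```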

Separation of Concerns in Agent and Tool Frameworks

A preference for separation of concerns in software development motivates keeping agents and tools as more distinct systems [13:41].

Model Context Protocol (MCP)

The Model Context Protocol (MCP), introduced by Anthropic and adopted by many, is a significant step towards separating client-side and server-side components in agentic applications [14:51].

  • A “host” (e.g., Claude Desktop) with a client connects to “servers” [15:10].
  • These servers act as backends with access to tools or assets like data files [15:20].
  • The host/client interacts only with the server, never directly with the tools [15:29].
  • The MCP server handles the logic, tool definitions, and tool imports, providing a clear distinction between frontend and backend concerns [15:48].
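
The server side of this split can be quite small. A minimal sketch using the FastMCP helper from the official MCP Python SDK (the tool itself is illustrative):

```python
# Minimal MCP server sketch using the official Python SDK's FastMCP helper.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("distance-tools")

@mcp.tool()
def get_distance(origin: str, destination: str) -> float:
    """Return the distance in kilometers between two locations."""
    # Real lookup logic (database, external API) lives here on the server;
    # the host/client only ever talks to the server, never to this code.
    return 4129.0  # placeholder value

if __name__ == "__main__":
    mcp.run()  # serves over stdio so a host such as Claude Desktop can connect
```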

Standalone Tool Platforms

A growing trend is the use of standalone tool platforms, which offer greater flexibility by separating tool creation and hosting from the agentic framework [16:25].

  • Tools are defined and hosted on remote servers [16:52].
  • The agentic framework imports these tools via SDKs or API calls [16:48].
  • The agent decides which tool to call using the LLM and then passes the request to the tool platform for execution [17:02].
  • These platforms often allow for chaining of tools (e.g., getting a user’s country, then finding customers in that country) [18:02] and handle authorization and error management [18:25].
  • This separation allows developers to build tools once and use them across different agentic frameworks (e.g., LangChain, CrewAI, AutoGen) [19:00].
  • Examples of such platforms include IBM’s wxflows, Composio, Toolhouse, Arcade AI, and Wildcard [19:31].
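
The wire-level pattern behind most of these platforms is simple: the framework asks the LLM which tool to call, then forwards the call to the platform over HTTP. A generic sketch in which the endpoint and payload shape are assumptions, not any particular vendor’s API:

```python
import requests

TOOL_PLATFORM_URL = "https://tools.example.com/execute"  # hypothetical endpoint

def execute_remote_tool(name: str, arguments: dict, api_key: str) -> dict:
    """Forward a tool call chosen by the LLM to the hosting platform,
    which handles authorization, execution, and error management."""
    response = requests.post(
        TOOL_PLATFORM_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"tool": name, "arguments": arguments},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```

Because the tool lives behind this one call, the same definition can back a LangChain, CrewAI, or AutoGen agent with only a thin adapter per framework.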

Dynamic Tools

Instead of creating a separate static tool for every function or query (e.g., a million tools for a CRM with various filters and data types) [21:35], dynamic tools offer a more flexible approach [22:29].

  • A single dynamic tool can connect to an API schema (like GraphQL) or a database (like SQL) [22:38].
  • The LLM is then instructed to generate the appropriate query (e.g., a GraphQL query) based on the provided schema and the user’s request [22:55].
  • Models such as Llama, OpenAI’s GPT series, and Claude are adept at generating valid GraphQL queries when provided with the schema, though complex features like custom scalars or deep nesting can cause issues [23:29].
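
A minimal sketch of the idea: one tool whose only parameter is the query string, with the schema included in the description so the model can write valid queries against it (the endpoint and schema are illustrative):

```python
import requests

GRAPHQL_ENDPOINT = "https://api.example.com/graphql"  # hypothetical endpoint

SCHEMA_SDL = """
type Customer { id: ID! name: String! country: String! }
type Query { customers(country: String): [Customer!]! }
"""

# One dynamic tool replaces a tool-per-query explosion: the model writes
# the GraphQL query itself, guided by the schema in the description.
dynamic_graphql_tool = {
    "name": "query_crm",
    "description": (
        "Execute a GraphQL query against the CRM. "
        "Generate the query from this schema:\n" + SCHEMA_SDL
    ),
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}

def query_crm(query: str) -> dict:
    """Execute whatever query the model generated."""
    response = requests.post(GRAPHQL_ENDPOINT, json={"query": query}, timeout=30)
    response.raise_for_status()
    return response.json()
```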

This approach reduces implementation effort in downstream integration and avoids duplicating business logic [24:18]. However, there are trade-offs, as LLMs can hallucinate, leading to poorly formed queries [24:37]. Despite this, dynamic tools represent a significant future direction compared to static tools [24:48].
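
One pragmatic mitigation for hallucinated queries is to validate the model’s output against the schema before executing it. With graphql-core this takes only a few lines (a sketch, reusing the illustrative schema from above):

```python
from graphql import GraphQLSyntaxError, build_schema, parse, validate

# Same illustrative schema as in the sketch above
schema = build_schema("""
type Customer { id: ID! name: String! country: String! }
type Query { customers(country: String): [Customer!]! }
""")

def is_valid_query(query: str) -> bool:
    """Reject malformed or schema-violating queries before execution."""
    try:
        document = parse(query)
    except GraphQLSyntaxError:
        return False
    return not validate(schema, document)  # validate() returns a list of errors

print(is_valid_query('{ customers(country: "NL") { name } }'))  # True
print(is_valid_query('{ no_such_field }'))                      # False
```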

In conclusion, as AI agents become more prevalent, it is crucial to prioritize the development of robust, reusable tools and to treat them as being just as important as the agents themselves [24:58].