From: aidotengineer
Tool calling matters more than many people realize and is crucial for agentic frameworks; it is not merely “plumbing” for AI agents [00:00:04].
Why Tool Calling Matters
While much attention goes to improving AI agents, building reusable and robust tools for those agents has received far less focus [00:00:55]. This is changing, with more tool platforms and libraries emerging to make tool building easier [00:01:06]. A common issue is that tools are often the first component to break when building an agent, whether because the Large Language Model (LLM) calls them incorrectly or because the tools themselves fail [00:02:35].
“The agent is only as good as its tools.” [00:02:56]
As LLMs have advanced significantly, the application layer, especially the place where tools are built, deserves more attention [00:03:20]. Wrappers around models and chat interfaces offer substantial room for improvement through smart software built on top of the core models [00:03:31].
Understanding Tool Calls
When an agent receives a complex query, such as “How far does the moon orbit expressed as the number of trips between Amsterdam and San Francisco?”, it often performs a series of tool calls [00:04:10]. This could involve searching for the distance between the Earth and the Moon, the distance between the two cities, and then performing calculations, potentially using external APIs like Wolfram Alpha [00:04:50].
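To make the example concrete, the final step of that workflow boils down to simple arithmetic over the two retrieved distances. The sketch below uses approximate values purely to illustrate the calculation; the talk mentions Wolfram Alpha as one way an agent could delegate this step.
[!example]
```python
# Approximate values an agent might retrieve with its search tools
earth_moon_distance_km = 384_400      # average Earth-Moon distance
amsterdam_sf_distance_km = 8_800      # rough Amsterdam-San Francisco distance

# Final calculation step (could also be delegated to a tool such as Wolfram Alpha)
trips = earth_moon_distance_km / amsterdam_sf_distance_km
print(f"About {trips:.1f} trips")     # roughly 43.7
```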
Defining Tools
The way a tool is defined significantly impacts its utility [00:05:56]. Key components of a tool definition, illustrated in the sketch after this list, include:
- Tool Name: Should be kept simple [00:06:04].
- Tool Description: Acts almost like a system prompt for the LLM, guiding its usage. Larger tools often have lengthy, detailed descriptions [00:06:09].
- Input Parameters: Specifies what the model needs to provide to call the tool [00:06:40].
- Output Schema: Increasingly important for type-safe tools, enabling structured outputs and the chaining of tool calls [00:06:46].
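Putting those four components together, a generic tool definition might look like the sketch below. It is a hedged, provider-agnostic example expressed as a Python dict: the parameters follow the JSON Schema style used by OpenAI-style function calling, the output_schema field is not part of every provider's format, and the names are illustrative.
[!example]
```python
# Generic sketch of a tool definition; exact field names vary by provider/framework
get_customer_count_tool = {
    # Tool name: keep it simple
    "name": "get_customer_count",
    # Tool description: acts almost like a system prompt for the LLM
    "description": (
        "Returns the number of customers for a given country. "
        "Use this whenever the user asks how many customers there are."
    ),
    # Input parameters: what the model must provide to call the tool (JSON Schema)
    "parameters": {
        "type": "object",
        "properties": {
            "country": {"type": "string", "description": "Country to filter on"},
        },
        "required": ["country"],
    },
    # Output schema: enables type-safe, structured outputs and chaining of tool calls
    "output_schema": {
        "type": "object",
        "properties": {"count": {"type": "integer"}},
    },
}
```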
Evolution of Tool Calling Paradigms
Historically, tools were often built and defined within the same agent framework [00:02:20]. However, there’s a shift towards more separated and flexible approaches.
Traditional Tool Calling
In traditional tool calling, the client application interacts with an agent or AI application, which then sends prompts to the LLM [00:09:02]. The application explicitly defines the logic to filter tool call messages from the LLM’s response, parse them, and execute them via callback functions [00:09:42]. This approach involves a significant amount of back-and-forth between the application and the LLM and requires handling queuing, retries, and errors [00:10:08].
[!example]
```python
# Simplified LangChain example for traditional tool calling
from langchain_core.tools import Tool

tools = [
    Tool(
        name="get_customer_count",
        func=get_customer_count_callback,  # callback function implemented elsewhere
        description="...",
    )
]
# Model and agent setup...
# Explicitly look for tool call messages in the LLM response and execute them
```
This method was common in early agentic setups [00:11:11].
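The explicit filter-parse-execute loop described above might look roughly like the following. This is a minimal sketch using LangChain's current chat-model API (bind_tools and the tool_calls attribute); the model name is an assumption, and get_customer_count_callback is the callback from the example above.
[!example]
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o")         # model choice is illustrative
llm_with_tools = llm.bind_tools(tools)   # tools defined as above

ai_msg = llm_with_tools.invoke([("user", "How many customers do we have in Germany?")])

# Filter the tool call messages out of the LLM response, parse them,
# and execute the matching callback ourselves
for tool_call in ai_msg.tool_calls:
    if tool_call["name"] == "get_customer_count":
        result = get_customer_count_callback(tool_call["args"])
        # The result then has to be sent back to the LLM for the final answer;
        # queuing, retries, and error handling are our responsibility
```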
Embedded Tool Calling
Modern frameworks often feature “embedded tool calling,” where the system acts as a closed black box [00:11:19]. The client application sends a question to the agent, which handles all tool calling logic internally, connecting with the LLM and tools, and then returns the final answer [00:11:34].
[!example]
```python
# Simplified LangChain example for embedded tool calling
from langchain.agents import AgentExecutor, create_react_agent

# Define tools, model, prompt...
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)
# All tool calling logic is hidden within create_react_agent and AgentExecutor
```
While embedded tool calling is easier for beginners to implement, because it abstracts away error handling and retries, it offers less control over the tool calling process and decision-making [00:13:10].
Strategies for Separation of Concerns
As a developer, you will often prefer to separate these concerns, which increases flexibility and maintainability [00:13:41].
Model Context Protocol (MCP)
Introduced by Anthropic, the Model Context Protocol (MCP) is a protocol for separating the client-side and server-side logic in agentic applications [00:14:51].
- Host/Client: The host, such as Claude Desktop, contains a client capable of connecting to servers [00:15:12].
- Servers: Act as backends with access to tools or assets like data files [00:15:20].
The client interacts only with the server, which handles the tool logic and definitions, providing a clear distinction between front-end and back-end responsibilities [00:15:29].
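As an illustration, a minimal MCP server exposing one tool might look like the sketch below. It assumes the official MCP Python SDK with its FastMCP helper; the server name, tool, and return value are illustrative.
[!example]
```python
from mcp.server.fastmcp import FastMCP

# The server side owns the tool definitions and the tool logic
mcp = FastMCP("customer-tools")

@mcp.tool()
def get_customer_count(country: str) -> int:
    """Return the number of customers in the given country."""
    return 42  # illustrative stand-in for a real database or API lookup

if __name__ == "__main__":
    # A host such as Claude Desktop connects to this server through its client
    mcp.run()
```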
Standalone Tool Platforms
A growing trend is the use of standalone tool platforms, which allow tools to be defined and hosted separately from the agentic framework [00:16:25].
- Tools are created and executed on remote servers [00:16:54].
- The agent imports these tools via an SDK or API calls [00:16:50].
- The agent’s role is to use the LLM to decide which tool to call and then pass the request to the tool platform for execution [00:17:02].
These platforms typically offer:
- Tool Creation: Via code or CLIs [00:17:40].
- Tool Execution/Surfacing: Via SDKs that integrate with frameworks like LangChain or CrewAI [00:17:46].
- Tool Chaining: The ability to sequence multiple tool calls within the platform, handling complex workflows (e.g., getting user country, then customer count) [00:17:59].
- Authorization and Error Handling: Centralized management of credentials and errors for various connected systems [00:18:25].
This separation offers flexibility, allowing developers to build tools once and use them across different agentic frameworks [00:19:11].
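To make the pattern concrete, the sketch below shows what consuming a standalone tool platform from LangChain can look like. The toolplatform package, ToolPlatformClient, and get_langchain_tools are hypothetical stand-ins rather than a real SDK; only the LangChain imports are real.
[!example]
```python
# "toolplatform" and its client are hypothetical, for illustration only
from toolplatform import ToolPlatformClient
from langchain.agents import AgentExecutor, create_react_agent

client = ToolPlatformClient(api_key="...")

# Tools are created, hosted, and executed on the platform; the agent only imports them
tools = client.get_langchain_tools(["get_user_country", "get_customer_count"])

# The LLM decides which tool to call; execution, authorization, and error handling
# are managed centrally on the platform side (llm and prompt defined as before)
agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)
```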
Dynamic Tools
Rather than creating a multitude of static tools for every possible operation (e.g., separate tools for “get customers by country,” “get customers by search filter”), dynamic tools allow the LLM to generate queries for existing APIs or databases [00:21:24].
For example, a single dynamic tool can connect to a GraphQL schema [00:22:36]. The LLM is given the GraphQL schema and instructed to generate valid GraphQL queries [00:23:01]. This allows the LLM to understand type definitions and the available operations (such as `getCustomers` or `getOrders`) [00:23:14].
[!info] Models such as Llama, OpenAI models, and Claude are particularly good at generating GraphQL queries, especially when schemas are simple and nesting is not too deep [00:23:29].
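A minimal sketch of such a dynamic tool is shown below, using LangChain's @tool decorator and the requests library. The endpoint URL is a hypothetical placeholder, and in practice the GraphQL schema would be included in the tool description or system prompt so the LLM can generate valid queries.
[!example]
```python
import requests
from langchain_core.tools import tool

GRAPHQL_URL = "https://example.com/graphql"  # hypothetical endpoint

@tool
def run_graphql_query(query: str) -> dict:
    """Execute a GraphQL query against the customer API.

    The LLM generates the query string from the GraphQL schema it was given,
    covering operations such as getCustomers or getOrders.
    """
    response = requests.post(GRAPHQL_URL, json={"query": query})
    response.raise_for_status()
    return response.json()
```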
Dynamic tools reduce the need to define many individual tools and prevent duplication of business logic, leading to more flexible integrations with existing APIs and databases [00:24:16]. However, there are trade-offs, as LLMs can sometimes hallucinate or generate syntactically incorrect queries [00:24:36]. Despite these challenges, the future of dynamic tools appears promising [00:24:48].