From: aidotengineer

While much attention is given to improving AI agents, the development of robust and reusable tools for these agents is often overlooked [00:00:58]. This oversight can lead to significant issues, as an agent’s effectiveness is often limited by the quality of its tools [00:02:56].

The Overlooked Importance of Tools

Developers frequently focus on enhancing agents but spend less time on building tools that are reusable, robust, and adaptable across different frameworks [00:00:58]. The application layer, specifically where tools are constructed, merits increased focus [00:03:25]. Often, the first component to fail in an agent is its tools [00:02:37]. Issues can include:

  • The Large Language Model (LLM) failing to call a tool correctly [00:02:39].
  • The LLM using the incorrect tool [00:02:41].
  • Internal failures within the tool itself [00:02:46].

Agents tend to behave as closed circuits, while tools are the dynamic, frequently changing part of the system [00:02:50]. Therefore, the agent is only as effective as the tools it employs [00:03:11].

Key Aspects of Tool Definition

The definition of a tool is crucial for its functionality and reusability:

  • Tool Name: Should be kept simple [00:06:06].
  • Tool Description: Acts almost like a system prompt for the LLM, guiding its usage [00:06:10]. Longer, more detailed descriptions are common in larger, “agentic” tools [00:06:21].
  • Input Parameters: Inform the model what is required to call the tool [00:06:40].
  • Output Schema: Increasingly vital, it enables type safety, structured outputs, and the chaining of tool calls [00:06:48]. Without it, the model may not know what data is returned [00:07:06].
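The four aspects above can be sketched as a single definition. This is a minimal example in the JSON-schema style used by most LLM APIs; the tool name, fields, and CRM framing are illustrative, not from a specific product:

```python
# Illustrative tool definition (JSON-schema style). Every field name here
# is an example, not a reference to a real API or product.
search_customers_tool = {
    "name": "search_customers",  # keep the tool name simple
    # The description acts almost like a system prompt: it tells the LLM
    # when and how to use the tool.
    "description": (
        "Search the CRM for customers. Use this whenever the user asks "
        "about customer records. Returns at most `limit` matches."
    ),
    # Input parameters tell the model what it must supply to call the tool.
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Free-text search"},
            "limit": {"type": "integer", "description": "Max results"},
        },
        "required": ["query"],
    },
    # The output schema enables type safety and chaining: without it, the
    # model does not know what shape of data comes back.
    "output_schema": {
        "type": "object",
        "properties": {
            "customers": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "id": {"type": "string"},
                        "name": {"type": "string"},
                    },
                },
            }
        },
    },
}
```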

Tool Calling Paradigms and Their Impact on Reusability

Understanding where a tool is called from is becoming increasingly important [00:08:18]. Tools are often built and defined in the same place as the agents themselves [00:08:22].

Traditional Tool Calling

In a traditional setup, the client application sends a prompt to an agent or AI application, which then passes it to the LLM [00:09:23]. The LLM recommends tool calls, and the application’s server-side logic defines and executes these tools based on the LLM’s recommendations [00:09:42]. This involves:

  • Explicitly looking for tool call messages from the LLM [00:10:47].
  • Parsing and executing the messages based on callback functions [00:10:54].
  • Handling tool call queuing, retries, and errors [00:11:01].

This method involves significant back-and-forth between the application logic and the LLM, making it less of a “closed system” [00:10:10]. While it offers more control over the tool calling process, it places the burden of managing complex logic on the developer [00:11:06].
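The traditional loop can be sketched in a few dozen lines. This is a hedged sketch, not any framework's real API: `fake_llm` stands in for an actual chat-completion call, and the callback names are invented for illustration:

```python
import json

def fake_llm(messages):
    # Stand-in for a real LLM call that recommends a tool call.
    return {"tool_calls": [{"id": "call_1", "name": "get_weather",
                            "arguments": json.dumps({"city": "Berlin"})}]}

# Server-side application logic owns the tool definitions and callbacks.
TOOL_CALLBACKS = {
    "get_weather": lambda city: {"city": city, "temp_c": 18},
}

def run_turn(messages, max_retries=2):
    response = fake_llm(messages)
    results = []
    # Explicitly look for tool-call messages from the LLM.
    for call in response.get("tool_calls", []):
        fn = TOOL_CALLBACKS.get(call["name"])
        if fn is None:  # the LLM picked a tool we don't have
            results.append({"id": call["id"], "error": "unknown tool"})
            continue
        args = json.loads(call["arguments"])
        # Queuing, retries, and error handling are all our responsibility.
        for attempt in range(max_retries + 1):
            try:
                results.append({"id": call["id"], "result": fn(**args)})
                break
            except Exception as exc:
                if attempt == max_retries:
                    results.append({"id": call["id"], "error": str(exc)})
    return results

print(run_turn([{"role": "user", "content": "Weather in Berlin?"}]))
```

The upside of owning this loop is full control over execution; the downside, as noted above, is that every retry and error path is yours to maintain.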

Embedded Tool Calling

Most modern frameworks utilize embedded tool calling, where the agent acts as a black box [00:11:22]. The client application simply asks a question, and the agent, as a self-contained system, handles all tool connections, LLM interactions, and tool calls internally, returning only the final answer [00:11:34].

  • Tool definitions are contained within the agent [00:12:17].
  • It is easier to implement for beginners as it abstracts away error handling and retries [00:13:13].
  • However, it offers no control over the tool calling process, execution decisions, or data format beyond the initial callback function [00:13:19]. This lack of control limits reusability, as tools are tightly coupled to the agent framework.
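By contrast, embedded tool calling collapses that whole loop behind one method. The sketch below mimics the shape of framework-style agent APIs without depending on any of them; the class and its internals are invented for illustration:

```python
# Hedged sketch of embedded tool calling: the agent is a black box that
# owns its tool definitions and the LLM loop. The caller only sees the
# final answer. This mimics framework-style APIs; it is not a real one.

def get_weather(city):
    return {"city": city, "temp_c": 18}

class Agent:
    def __init__(self, tools):
        # Tool definitions live inside the agent, coupled to it.
        self._tools = {t.__name__: t for t in tools}

    def ask(self, question):
        # Internally a real agent would call the LLM, execute recommended
        # tools, feed results back, and loop; collapsed to one branch here.
        if "weather" in question.lower():
            temp = self._tools["get_weather"]("Berlin")["temp_c"]
            return f"It is {temp} C"
        return "I don't know."

agent = Agent(tools=[get_weather])
# The caller never sees the tool call, the retry logic, or the data format.
print(agent.ask("What's the weather?"))
```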

Strategies for Building Reusable and Robust Tools

To overcome the challenges of tightly coupled tools and enhance reusability, several architectural strategies can be employed.

Separation of Concerns

Adopting a “separation of concerns” approach, akin to decoupling backend and frontend in web development, is beneficial [00:13:41]. This means keeping different parts of your system distinct, even if they reside in the same repository [00:14:30].

Model Context Protocol (MCP)

The Model Context Protocol (MCP), introduced by Anthropic, facilitates the separation of client-side (host) and server-side components in agentic applications [00:14:53].

  • The host (e.g., desktop client) contains a client that connects to servers [00:15:12].
  • Servers act as backends, providing access to tools or data files [00:15:20].
  • The host/client interacts with the server, but does not directly see the tools, which are made available through the server [00:15:29].
  • Tool logic and definitions are managed on the MCP server, allowing for a clear distinction between front-end and back-end logic [00:15:43]. This approach makes tool calling truly separate from the agentic framework [00:16:12].
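The separation MCP enforces can be illustrated with a toy registry. To be clear, this is not the real MCP wire protocol or SDK; it only models the split the protocol creates, where tool logic lives server-side and the host discovers tools through a listing interface:

```python
# Toy model of the MCP-style split (NOT the real protocol or SDK):
# tool logic and definitions live on the server; the host/client only
# sees what the server exposes via list_tools / call_tool.

class ToolServer:
    def __init__(self):
        self._tools = {}

    def tool(self, fn):
        # Register tool logic server-side; the host never imports this code.
        self._tools[fn.__name__] = fn
        return fn

    def list_tools(self):
        # The host discovers names and descriptions, not implementations.
        return [{"name": n, "description": f.__doc__}
                for n, f in self._tools.items()]

    def call_tool(self, name, args):
        return self._tools[name](**args)

server = ToolServer()

@server.tool
def add(a: int, b: int):
    """Add two integers."""
    return a + b

# Host side: front-end logic stays cleanly separated from back-end logic.
print(server.list_tools())
print(server.call_tool("add", {"a": 2, "b": 3}))
```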

Standalone Tool Platforms

Standalone tool platforms offer a more distinct separation. Instead of defining tools within a closed agent loop, they are defined separately, often hosted on remote servers [00:16:39].

  • The agent imports these tools via an SDK or API call [00:16:48].
  • Tool creation and execution are handled independently [00:16:54].
  • The agent’s role is to take the tool definition, use the LLM to decide which tool to call, and pass this decision to the tool platform for execution [00:17:02].
  • These platforms often allow for chaining tools, handling authorization, and managing errors [00:17:11].
  • This separation provides significant flexibility, allowing developers to build tools once and use them across different agent frameworks (e.g., LangChain, CrewAI, AutoGen) [00:19:11]. This means a developer can swap agent frameworks without rebuilding their toolset [00:20:59].
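The division of labor described above can be sketched as follows. `ToolPlatform` and its methods are hypothetical; real platforms expose similar list/execute endpoints over HTTP or an SDK, and handle authorization, chaining, and errors centrally:

```python
# Hedged sketch of the standalone-platform pattern. `ToolPlatform` is a
# hypothetical stand-in for a remote service reached via SDK or API call.

class ToolPlatform:
    def __init__(self):
        # Tool creation and execution live here, outside any agent loop.
        self._registry = {"send_email": lambda to, body: f"sent to {to}"}

    def export_definitions(self):
        # Agents import definitions only, not implementations.
        return [{"name": "send_email", "parameters": ["to", "body"]}]

    def execute(self, name, args):
        return self._registry[name](**args)

def decide_tool(definitions, user_request):
    # Stand-in for the LLM choosing which imported tool to call.
    return {"name": "send_email",
            "args": {"to": "a@example.com", "body": user_request}}

platform = ToolPlatform()
defs = platform.export_definitions()     # agent imports tools via SDK/API
decision = decide_tool(defs, "Say hi")   # LLM decides which tool to call
result = platform.execute(decision["name"], decision["args"])  # platform runs it
print(result)
```

Because the agent only ever handles definitions and decisions, the same toolset can back a LangChain, CrewAI, or AutoGen agent interchangeably.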

Dynamic Tools (e.g., using GraphQL or SQL)

Rather than creating numerous static tools for every specific function (e.g., a separate tool for each customer search filter in a CRM), dynamic tools leverage query languages like GraphQL or SQL [00:21:35].

  • A single dynamic tool can connect to a GraphQL schema or a database [00:22:29].
  • The LLM generates the specific query (e.g., GraphQL query) based on the user’s request and the provided schema [00:22:55].
  • The LLM is given the tool and the schema, understanding that it needs to generate a valid query [00:22:59].
  • This approach reduces the number of tool definitions and avoids duplicating business logic [00:24:25].
  • LLMs, including models such as Llama, GPT, and Claude, are capable of generating complex queries such as GraphQL, including concepts like fragments [00:23:29].
  • Trade-offs: LLMs can sometimes hallucinate fields or produce invalid syntax, particularly with complex schemas or deep nesting [00:24:37].
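A single dynamic tool of this kind can be sketched with SQLite. `llm_generate_sql` is a stand-in that returns a canned query where a real LLM would translate the request plus schema into SQL; the table and guard rail are illustrative:

```python
import sqlite3

# One dynamic tool backed by SQL, instead of a static tool per filter.
# The LLM is given the schema and generates the query itself.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(1, "Ada", "London"), (2, "Linus", "Helsinki")])

SCHEMA = "customers(id INTEGER, name TEXT, city TEXT)"

def llm_generate_sql(schema, request):
    # Stand-in for a real LLM turning request + schema into a valid query.
    return "SELECT name FROM customers WHERE city = 'London'"

def dynamic_query_tool(request):
    sql = llm_generate_sql(SCHEMA, request)
    # Guard against hallucinated or unsafe statements: read-only queries only.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only SELECT statements are allowed")
    return conn.execute(sql).fetchall()

print(dynamic_query_tool("Which customers are in London?"))  # → [('Ada',)]
```

One tool definition now covers every search the schema supports, which is exactly how the approach avoids duplicating business logic across many static tools.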

In conclusion, as the focus shifts to building more sophisticated AI agents, it is imperative to equally prioritize the development of robust, reusable tools [00:24:58].