Function calling, or “tools,” enables large language models (LLMs) to interact with the outside world and perform actions beyond just generating text [00:06:17]. This capability is a core feature of the AI SDK [00:02:22].
Core Concept of Tools
The fundamental idea behind tools is simple: you provide an LLM with a prompt and a list of available tools as part of the conversation context [00:06:29]. Each tool includes:
- Name: A unique identifier for the tool [00:06:40].
- Description: A clear explanation of what the tool does, which the model uses to decide when to invoke it [00:06:42] [00:08:13].
- Parameters: Any data required by the tool for its operation, which the model attempts to parse from the conversation context [00:06:47] [00:08:22].
Instead of generating a text response, the model will generate a “tool call” if it decides to use a tool to solve a user’s query [00:06:53]. This tool call includes the tool’s name and the arguments parsed from the conversation context [00:07:01]. The developer is then responsible for parsing that tool call, running the associated code, and handling the results [00:07:13].
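For illustration, a tool call is structured data rather than prose. A hypothetical, provider-agnostic example of what the model might emit (the exact wire format varies by provider):

```ts
// Hypothetical shape of a tool call emitted by the model instead of text.
// Illustrative only; each provider has its own wire format.
const toolCall = {
  toolName: 'addNumbers',       // which tool the model chose
  args: { num1: 10, num2: 5 },  // arguments parsed from the conversation
};
```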
Implementing Tools with AI SDK
`generateText` with Tools
The `generateText` or `streamText` functions in the AI SDK accept a `tools` object as input [00:07:51]. Within this object, you define each tool using the `tool` utility function [00:08:06].
The `tool` utility function provides:
- `description`: Explains the tool’s purpose to the model [00:08:13].
- `parameters`: Defines the data schema required by the tool. This can be defined using Zod, a TypeScript validation library, which ensures type safety between the defined parameters and the arguments received by the `execute` function [00:08:22] [00:08:54] [00:19:47].
- `execute`: An asynchronous JavaScript function that contains the arbitrary code to be run when the language model generates a tool call [00:08:33].
The AI SDK automatically parses any tool calls, invokes the `execute` function, and returns the result in a `toolResults` array [00:09:24].
```ts
import { generateText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

async function main() {
  const { toolResults } = await generateText({
    model: openai('gpt-4o-mini'),
    prompt: "What's 10 + 5?",
    tools: {
      addNumbers: tool({
        description: 'Adds two numbers together.',
        // Zod schema; the SDK type-checks the args passed to execute
        parameters: z.object({
          num1: z.number(),
          num2: z.number(),
        }),
        execute: async ({ num1, num2 }) => num1 + num2,
      }),
    },
  });
  console.log(toolResults); // e.g. [{ toolName: 'addNumbers', args: { num1: 10, num2: 5 }, result: 15 }]
}

main();
```
Building Agents with AI SDK using `maxSteps`
When a model generates a tool call, it typically doesn’t generate subsequent text [00:10:05]. To allow the model to incorporate tool results into a generated text answer or perform multiple actions, the `maxSteps` property is used [00:11:35].
`maxSteps` enables an “agentic loop” [00:12:16]:
- If the model generates a tool call, the `toolResult` is sent back to the model along with the previous conversation context [00:11:50].
- This triggers another generation, allowing the model to continue autonomously until it either generates plain text or reaches the `maxSteps` threshold [00:12:03].
This allows the model to pick the next step in a process without requiring complex manual logic or rerouting [00:12:21].
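Conceptually, `maxSteps` automates a loop along these lines. This is a simplified sketch with hypothetical `callModel`/`runTool` helpers standing in for SDK internals, not the SDK's actual implementation:

```ts
// Simplified sketch of the agentic loop that maxSteps automates.
// callModel and runTool are hypothetical stand-ins for SDK internals.
type ModelOutput =
  | { type: 'text'; text: string }
  | { type: 'tool-call'; toolName: string; args: unknown };

declare function callModel(messages: unknown[]): Promise<ModelOutput>;
declare function runTool(name: string, args: unknown): Promise<unknown>;

async function agenticLoop(messages: unknown[], maxSteps: number) {
  for (let step = 0; step < maxSteps; step++) {
    const output = await callModel(messages); // one generation
    if (output.type === 'text') return output.text; // plain text ends the loop
    // Tool call: run the tool and feed the result back into the context.
    const result = await runTool(output.toolName, output.args);
    messages.push({ role: 'assistant', toolCall: output });
    messages.push({ role: 'tool', toolResult: result });
  }
  return undefined; // maxSteps threshold reached without a text answer
}
```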
Example: Parallel Tool Calls and Multi-Step Reasoning
A more complex example demonstrates an agent using multiple tools, potentially in parallel, and then synthesizing the results [00:13:50]: asking for the weather in multiple cities and then adding the temperatures together [00:15:18]. The model can infer missing parameters (such as a city’s latitude and longitude) from context [00:15:05].
```ts
import { generateText, tool } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

async function main() {
  const { text, toolResults } = await generateText({
    model: openai('gpt-4o-mini'),
    prompt: 'Get the weather in San Francisco and New York and then add them together.',
    maxSteps: 3, // allows multiple steps of tool calls and text generation
    tools: {
      getWeather: tool({
        description: 'Get the current weather at a location.',
        parameters: z.object({
          latitude: z.number().optional(),
          longitude: z.number().optional(),
          city: z.string(), // the model can infer lat/long from the city
        }),
        execute: async ({ city }) => {
          // Simulate a weather API call
          if (city === 'San Francisco') return { temperature: 12.3 };
          if (city === 'New York') return { temperature: 15.2 };
          return { temperature: 0 };
        },
      }),
      addNumbers: tool({
        description: 'Adds two numbers together.',
        parameters: z.object({
          num1: z.number(),
          num2: z.number(),
        }),
        execute: async ({ num1, num2 }) => num1 + num2,
      }),
    },
  });
  console.log(text); // e.g. "The temperature in San Francisco is 12.3°C and in New York 15.2°C. Added together, that's 27.5°C."
  console.log(toolResults); // the sequence of tool calls and their results
}

main();
```
Generating Structured Data (Structured Outputs)
Beyond text, the AI SDK allows models to generate structured data, which is useful for programmatic integration [00:18:38].
`generateText` with `experimental_output`
The `generateText` function includes an `experimental_output` option to define the desired structure using a Zod schema [00:18:46] [00:19:30].
```ts
import { generateText, tool, Output } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

async function main() {
  const { experimental_output: result } = await generateText({
    model: openai('gpt-4o-mini'),
    prompt: 'Get the weather in San Francisco and New York and then add them together.',
    maxSteps: 3,
    tools: { /* ... (same tools as above) ... */ },
    experimental_output: Output.object({
      schema: z.object({
        sum: z.number().describe('The sum of the temperatures.'),
      }),
    }),
  });
  console.log(result.sum); // direct access to the sum as a number
}

main();
```
`generateObject` Function
The `generateObject` function is specifically dedicated to structured outputs and is highly favored for this purpose [00:18:55]. It simplifies defining the output schema.
```ts
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

async function main() {
  const { object } = await generateObject({
    model: openai('gpt-4o-mini'),
    prompt: 'Please come up with 10 definitions for AI agents.',
    schema: z.object({
      definitions: z.array(
        z.string().describe(
          'Each definition should be completely incoherent and use as much jargon as possible.',
        ),
      ),
    }),
  });
  console.log(object.definitions);
}

main();
```
Zod’s `.describe()` method can be chained onto any schema element to provide detailed instructions to the language model about the desired characteristics of that specific value [00:23:17].
`generateObject` also supports an “enum” mode for when the output needs to be one of a few predefined values (e.g., ‘relevant’ or ‘irrelevant’), which is very ergonomic and easier for the model [00:40:15].
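A minimal sketch of enum mode, reusing the model from the earlier examples (the prompt text here is illustrative):

```ts
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';

async function main() {
  const { object } = await generateObject({
    model: openai('gpt-4o-mini'),
    output: 'enum', // constrain the output to one of the listed values
    enum: ['relevant', 'irrelevant'],
    prompt: 'Is a page about bike repair relevant to the query "fix a flat tire"?',
  });
  console.log(object); // 'relevant' | 'irrelevant'
}

main();
```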
Developing Custom AI Tools and Functions in Complex Systems
Integrating AI agents with external tools is crucial for building complex AI systems. The principles of tool calling and structured outputs are foundational for advanced applications, such as a “deep research clone” [00:24:39] [00:25:24].
Example: Deep Research Clone Workflow
A deep research clone workflow exemplifies using reasoning models and tool calls in a structured way [00:26:50]:
- Generate Subqueries: An initial query is broken down into multiple search queries using `generateObject` [00:29:48].
- Search the Web: A `searchWeb` function (e.g., using the Exa API) retrieves relevant results for each subquery [00:32:53]. It’s crucial to filter extraneous information from search results to reduce token count and improve model effectiveness [00:35:02].
- Analyze and Process: The `searchAndProcess` function acts as an agent, using `generateText` with `maxSteps` and two key tools: `searchWeb`, which performs the actual web search, and `evaluate`, which determines whether the search results are relevant and can also prevent re-using already-processed sources [00:39:39] [00:52:22]. Large search results should be handled as local variables rather than passed as tool parameters, to avoid unnecessary token consumption [00:41:30].
- Generate Learnings: The `generateLearnings` function uses `generateObject` to extract insights (“learnings”) and “follow-up questions” from relevant search results [00:43:39].
- Recursion and State Management: The entire process is encapsulated in a recursive `deepResearch` function (see the sketch after this list) [00:46:13]. Accumulated research (original query, active queries, search results, learnings, completed queries) is stored in a mutable state object, allowing the model to go deeper into sub-topics while retaining context [00:46:53].
- Generate Report: Finally, a `generateReport` function uses a larger reasoning model (e.g., o3-mini) to synthesize all accumulated research into a comprehensive report [00:54:26]. Providing a detailed system prompt (e.g., persona, date, markdown formatting, guidelines) significantly improves the quality and structure of the final output [00:57:22].
This workflow demonstrates how different AI SDK functions can be combined to build sophisticated, multi-step AI systems that solve complex problems [00:25:22] [00:26:24].