From: aidotengineer
The AI SDK enables developers to create sophisticated applications, including those capable of deep research. A deep research clone can process an initial query, generate sub-queries, search the web for relevant information, analyze findings, and recursively follow up on new questions to build a comprehensive report [00:25:00]. This process leverages the AI SDK’s capabilities for text generation, tool use (function calling), and structured data output [00:25:22].
AI SDK Fundamentals for Building Agents
Building agents with the AI SDK involves several core primitives that provide flexibility and power:
- Generate Text Function
The `generateText` function calls a large language model to produce text [00:01:06]. It accepts either a simple `prompt` or an array of `messages` with a specified `role` and `content` [00:02:16].
- Unified Interface
A key feature of the AI SDK is its unified interface, which allows developers to switch between different language models by changing a single line of code [00:02:41]. This flexibility is useful for optimizing cost, speed, or quality for a specific use case [00:02:50]. For example, a developer can seamlessly transition from OpenAI's GPT-4o mini to Perplexity's Sonar Pro or Google's Gemini 1.5 Flash when web search capabilities are needed [00:03:53]. The SDK also provides access to the sources used by search-backed models like Perplexity [00:04:48].
- Using Tools and Function Calling
Tools, or function calling, enable language models to interact with the outside world and perform actions [00:06:22].
- Mechanism: The model is given a prompt and a list of available tools, each with a name, description, and required parameters [00:06:29]. Instead of generating text, the model might generate a tool call, including the tool’s name and arguments parsed from the conversation context [00:06:58]. The developer then executes this tool call [00:07:13].
- Implementation: Tools are passed to `generateText` or `streamText` via a `tools` object. The `tool` utility function provides type safety between the defined parameters and the arguments passed to the `execute` function [00:08:06]. The AI SDK automatically parses tool calls, invokes the `execute` function, and returns the result in a `toolResults` array [00:09:23].
- Autonomous Multi-step Agents (`maxSteps`): To allow the model to incorporate tool results into a text answer, the `maxSteps` property can be set [00:11:35]. If a tool call is generated, the tool result and previous context are sent back to the model, triggering another generation. This process continues until plain text is generated or the maximum number of steps is reached, enabling autonomous, multi-step agent behavior [00:12:03]. The model can even perform parallel tool calls [00:16:42].
- Generating Structured Data (Structured Outputs)
The AI SDK offers two ways to generate structured outputs:
- `generateText` with `experimental_output`: This option allows defining a schema (e.g., using Zod) for the expected output structure [00:18:49]. Zod, a TypeScript validation library, pairs well with the AI SDK for defining schemas and makes structured outputs easy to work with [00:19:48].
- `generateObject` Function: This function is specifically designed for structured outputs [00:18:55]. It takes a prompt and a schema, returning a type-safe object [00:22:09]. A useful feature is Zod's `.describe()`, which allows providing detailed context for specific values in the schema, influencing the model's output without altering the main prompt [00:23:14]. `generateObject` can also operate in enum mode for scenarios with a limited set of discrete values (e.g., "relevant" or "irrelevant") [00:40:15].
Building a Deep Research Clone
The deep research workflow involves several sequential and recursive steps to gather and synthesize information:
Workflow Overview
- Generate Sub-queries: Convert a broad initial query into specific search queries [00:26:58].
- Search the Web: Find relevant results for each sub-query [00:27:17].
- Analyze Results: Extract learnings and identify follow-up questions [00:27:20].
- Recursive Follow-up: Take follow-up questions, generate new queries, and repeat the process while accumulating research [00:27:31].
- Synthesize Report: Aggregate all gathered information into a comprehensive report [00:27:59].
The process uses `depth` to control how many levels deep the research goes and `breadth` to determine how many different inquiries are pursued at each step [00:29:06].
Step 1: Generating Search Queries
The `generateSearchQueries` function uses `generateObject` to create a list of search-engine-optimized queries from a given prompt [00:29:50]. The function specifies a schema for an array of strings to ensure the output is structured as expected [00:30:38].
Step 2: Searching the Web
The `searchWeb` function utilizes the Exa API to search the web for relevant results [00:32:53]. Key considerations include:
- Result Count: Limiting the number of results to optimize token usage [00:34:04].
- Live Crawl: Using `liveCrawl` to ensure results are current, accepting a potential hit on performance [00:34:25].
- Trimming Information: Only returning necessary data (e.g., title, URL, content) from search results to reduce token count and improve model effectiveness by removing irrelevant information like favicons [00:34:50].
Step 3: Analyzing Results and Iterating (Agentic Loop)
The `searchAndProcess` function implements an agentic loop using `generateText` with two tools:
- `searchWeb`: Executes the web search for a given query [00:38:57]. Search results are pushed to a `pendingSearchResults` array [00:39:24].
- `evaluate`: Determines the relevance of the most recent search result [00:39:41]. It uses `generateObject` in enum mode to classify results as "relevant" or "irrelevant" [00:40:15]. If a result is irrelevant, it provides feedback to the model to search again with a more specific query, perpetuating the `maxSteps` loop [00:40:56]. This tool also prevents re-using sources that have already been processed [00:52:54].
Step 4: Generating Learnings and Follow-up Questions
The `generateLearnings` function uses `generateObject` to analyze a relevant search result [00:43:41]. It extracts a `learning` (insight) and identifies `followUpQuestions`, both defined by a Zod schema [00:44:15]. The search result is passed as stringified XML within the prompt to provide context [00:44:08].
Step 5: Implementing Recursion and State Management
The deepResearch
function orchestrates the recursive nature of the research process:
- It takes an initial prompt and recursively calls itself with new queries derived from follow-up questions [00:46:13].
- `depth` and `breadth` parameters control the extent of the recursive search [00:49:09].
- A global or external `accumulatedResearch` state object stores all gathered information (original query, active queries, search results, learnings, completed queries) across recursive calls [00:48:10].
- The function decrements `depth` with each recursive call to ensure termination and prevent infinite loops, conserving API credits [00:50:01].
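The recursion and state management might look roughly like this; the three helpers are the sketches from Steps 1, 3, and 4 (declared here, not implemented), and the default `depth`/`breadth` values are illustrative:

```typescript
type Learning = { learning: string; followUpQuestions: string[] };
type Result = { url: string; content: string };

// Assumed helpers from earlier steps (declared only for the sketch).
declare function generateSearchQueries(query: string, n: number): Promise<string[]>;
declare function searchAndProcess(query: string, used: Set<string>): Promise<Result[]>;
declare function generateLearnings(query: string, result: Result): Promise<Learning>;

// External state accumulated across all recursive calls.
export const accumulatedResearch = {
  query: '',
  completedQueries: [] as string[],
  searchResults: [] as Result[],
  learnings: [] as Learning[],
};

const usedUrls = new Set<string>();

export async function deepResearch(prompt: string, depth = 2, breadth = 3): Promise<void> {
  if (depth === 0) return; // depth shrinks each level: guarantees termination
  if (!accumulatedResearch.query) accumulatedResearch.query = prompt;

  const queries = await generateSearchQueries(prompt, breadth);
  for (const query of queries) {
    const results = await searchAndProcess(query, usedUrls);
    accumulatedResearch.completedQueries.push(query);
    for (const result of results) {
      accumulatedResearch.searchResults.push(result);
      const learning = await generateLearnings(query, result);
      accumulatedResearch.learnings.push(learning);
      // Recurse on follow-up questions with one less level of depth.
      for (const q of learning.followUpQuestions.slice(0, breadth)) {
        await deepResearch(q, depth - 1, breadth);
      }
    }
  }
}
```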
Step 6: Generating the Final Report
Once the recursive research is complete, the `generateReport` function takes the `accumulatedResearch` and uses `generateText` to synthesize all the information into a comprehensive report [00:54:39].
- Model Selection: A reasoning model such as OpenAI's o3-mini can be used for synthesis [00:54:57].
- System Prompt: A detailed system prompt is crucial for guiding the model on the desired report structure and persona (e.g., “expert researcher”), including markdown formatting, today’s date, and general research analyst-oriented guidelines [00:57:22]. This ensures a high-quality, structured output, minimizing the need for the model to infer formatting or tone [00:56:51].
This comprehensive workflow, built with the AI SDK, demonstrates how complex AI systems can be constructed by combining fundamental building blocks and leveraging agentic behavior to tackle challenging tasks like deep research [00:59:08].