From: aidotengineer

The process of building AI agents is undergoing a significant transformation, moving away from traditional frameworks towards a more modular approach using AI primitives [00:00:05]. This method aims to accelerate the development of production-ready AI agents [00:04:50].

The Shift from Frameworks to Primitives

Many production-ready AI agents, such as Perplexity, Cursor, V0, and Chai, are not built on top of AI frameworks [00:00:51], [00:01:13], [00:25:28].

Limitations of AI Frameworks

Frameworks are often criticized for being bloated, moving slowly, and containing unnecessary abstractions that hinder development [00:01:21], [00:05:01]. In a rapidly evolving field where new paradigms and LLMs emerge weekly, frameworks can quickly become outdated or inflexible [00:25:51].

Instead of frameworks, the recommended approach is building on top of AI primitives [00:01:31]. Primitives are built to perform well in production environments [00:02:29]. A good analogy is Amazon S3, a low-level primitive for data storage that scales massively without being a framework [00:02:33].

Defining AI Agents

AI agents represent a new paradigm for writing code, fundamentally changing how coding projects and SaaS products are built [00:03:32], [00:03:40]. They are designed to automate tasks and interact with data or other systems.

Core AI Primitives for Agent Development

AI primitives are small, composable building blocks that are useful across various stacks [00:03:56], [00:05:17]. Key primitives for building AI agents include [00:10:23]:

  • Memory: An autonomous knowledge engine that can include a vector store, capable of scaling to terabytes of data for long-term storage and search [00:05:24], [00:10:28], [00:12:19].
  • Workflow Engine: Purpose-built for multi-step agent tasks [00:10:30].
  • Threads: Used to store and manage the context and history of conversations with an agent [00:04:01], [00:10:35], [00:11:39]. This includes “scratch pads” for temporary, relevant information [00:11:58].
  • Parser: Extracts context from various file formats, such as converting PDF to text [00:08:14], [00:10:41].
  • Chunker: Splits extracted text into smaller pieces of context for similarity search [00:08:20], [00:10:44].
  • Tools Infrastructure: Allows agents to automatically call external APIs or services [00:05:40], [00:11:33].
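To make the parser-chunker step concrete, here is a minimal sketch of a chunker primitive that splits extracted text into overlapping pieces sized for similarity search. The function name and defaults are illustrative, not any specific library's API:

```typescript
// Chunker primitive sketch: split extracted text (e.g. parser output
// from a PDF) into overlapping chunks for similarity search.
function chunk(text: string, size = 200, overlap = 40): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

The overlap preserves context that would otherwise be cut at chunk boundaries, which tends to improve retrieval quality.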

When agents are built with composable primitives that come with cloud-based capabilities (like auto-scaling memory), the result is a serverless AI agent that automates the heavy lifting [00:05:19].

Common AI Agent Architectures Using Primitives

Various agent architectures can be constructed using these fundamental AI primitives [00:11:12], [00:11:14]:

1. Augmented LLM

This architecture involves an agent that takes input, generates output using an LLM, and has the ability to call tools, access threads for conversation history, and utilize long-term memory [00:11:18]. This is a foundational architecture for almost any type of AI agent [00:12:44].
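The shape of this architecture can be sketched as a class that wraps an LLM call with the tools, threads, and memory primitives. The class and the `callLLM` signature are assumptions for illustration (here stubbed rather than a real model API):

```typescript
// Augmented-LLM sketch: one LLM call surrounded by tools, a thread
// (conversation history), and long-term memory. All names are illustrative.
type Message = { role: "user" | "assistant"; content: string };
interface Tool { name: string; run: (input: string) => string }

class AugmentedAgent {
  thread: Message[] = [];  // thread primitive: conversation history
  memory: string[] = [];   // memory primitive: long-term store
  constructor(
    private tools: Tool[],
    private callLLM: (prompt: string) => string, // stand-in for any model API
  ) {}

  ask(input: string): string {
    this.thread.push({ role: "user", content: input });
    // Assemble memory, thread, and available tool names into the prompt.
    const prompt = [
      ...this.memory,
      ...this.thread.map(m => `${m.role}: ${m.content}`),
      `tools: ${this.tools.map(t => t.name).join(", ")}`,
    ].join("\n");
    const answer = this.callLLM(prompt);
    this.thread.push({ role: "assistant", content: answer });
    return answer;
  }
}
```

Because every later architecture reuses this input-LLM-output core, the other patterns below can be read as compositions of this one.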

2. Prompt Chaining and Composition

This involves multiple agents working together in sequence [00:13:06]. For example, one agent might classify an email as spam or not, and if not spam, another agent might draft a response [00:13:22]. Examples include summary agents, features agents, and marketing copy agents working in concert [00:13:35].
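The spam-classification example above can be sketched as a two-step chain, where the first agent's verdict gates the second. The toy regex heuristic stands in for an LLM-backed classifier:

```typescript
// Prompt-chaining sketch: agent 1 classifies, agent 2 runs only if needed.
function classifyEmail(text: string): "spam" | "not_spam" {
  // Toy heuristic in place of an LLM classification call.
  return /lottery|winner/i.test(text) ? "spam" : "not_spam";
}

function draftReply(text: string): string {
  // Stand-in for an LLM drafting agent.
  return `Thanks for your note about "${text.slice(0, 20)}"...`;
}

function handleEmail(text: string): string | null {
  if (classifyEmail(text) === "spam") return null; // chain stops here
  return draftReply(text);                          // next agent in the chain
}
```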

3. Agent Router (LLM Router)

In this architecture, a routing agent (an LLM) decides which specialized agent should be called next based on the input [00:14:00]. An example involves a router agent selecting between a summary agent (using Gemini), a reasoning agent (using Llama 70B), or a coding agent (using Claude Sonnet) [00:14:24].
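A minimal sketch of that routing step follows; the keyword checks stand in for the routing LLM's decision, and the model names in the comments mirror the talk's example:

```typescript
// LLM-router sketch: pick the specialized agent for a task.
type AgentName = "summary" | "reasoning" | "coding";

function route(task: string): AgentName {
  if (/summar/i.test(task)) return "summary";  // e.g. handled by Gemini
  if (/code|bug/i.test(task)) return "coding"; // e.g. handled by Claude Sonnet
  return "reasoning";                           // e.g. handled by Llama 70B
}
```

In production the `route` function would itself be an LLM call that returns the chosen agent's name as structured output.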

4. Parallel Agents

Certain tasks allow for multiple agents to run simultaneously [00:16:47]. This is simplified in languages like JavaScript, where a set of agents (e.g., sentiment, summary, decision-maker) can be run concurrently using Promise.all [00:16:55].
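The Promise.all pattern described above can be sketched like this, with each agent stubbed as an async function:

```typescript
// Parallel-agents sketch: independent agents run concurrently via Promise.all.
// Each stub stands in for an LLM-backed agent.
async function sentimentAgent(text: string) { return "positive"; }
async function summaryAgent(text: string) { return text.slice(0, 10); }
async function decisionAgent(text: string) { return "approve"; }

async function runParallel(text: string) {
  const [sentiment, summary, decision] = await Promise.all([
    sentimentAgent(text),
    summaryAgent(text),
    decisionAgent(text),
  ]);
  return { sentiment, summary, decision };
}
```

This only helps when the agents are independent; if one agent needs another's output, the chaining or orchestrator patterns apply instead.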

5. Orchestrator-Worker Agents

This architecture involves an orchestrator agent that plans and creates subtasks, which are then completed by multiple worker agents. Finally, a synthesizer agent consolidates the results [00:17:13]. This mirrors deep research agent architectures [00:17:30]. For instance, an orchestrator might break down a request for a blog post into subtasks for introduction, specific sections (productivity, work-life balance, environmental impact), and conclusion, with each section handled by a separate worker agent [00:18:10].
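The blog-post example above can be sketched in three stubbed stages, with each function standing in for an LLM-backed agent:

```typescript
// Orchestrator-worker sketch: plan subtasks, fan out to workers, synthesize.
function orchestrate(request: string): string[] {
  // The orchestrator breaks the request into section subtasks
  // (sections mirror the talk's blog-post example).
  return ["introduction", "productivity", "work-life balance",
          "environmental impact", "conclusion"]
    .map(section => `Write the ${section} for: ${request}`);
}

function worker(subtask: string): string {
  return `[draft for: ${subtask}]`; // one worker agent per subtask
}

function synthesize(drafts: string[]): string {
  return drafts.join("\n\n"); // the synthesizer merges worker output
}

function runOrchestration(request: string): string {
  return synthesize(orchestrate(request).map(worker));
}
```

In a real deployment the workers would typically run in parallel (see the previous pattern), since the subtasks are independent once planned.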

6. Evaluator-Optimizer

In this pattern, an LLM (generator agent) produces a response (e.g., marketing copy), which is then evaluated by another LLM acting as a “judge” [00:20:08]. The judge either accepts the response or rejects it, providing specific feedback for improvement [00:20:22]. This iterative feedback loop helps optimize the output.
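The generator-judge loop can be sketched as follows; both roles are stubbed here, whereas in practice each would be a separate LLM call:

```typescript
// Evaluator-optimizer sketch: generate, judge, feed feedback back, retry.
type Verdict = { accepted: boolean; feedback?: string };

function judge(copy: string): Verdict {
  // Toy acceptance criterion standing in for an LLM judge.
  return copy.includes("call to action")
    ? { accepted: true }
    : { accepted: false, feedback: "add a call to action" };
}

function generate(prompt: string, feedback?: string): string {
  // Stand-in generator: a real LLM would revise using the feedback.
  return feedback ? `${prompt} - now with a call to action` : prompt;
}

function optimize(prompt: string, maxRounds = 3): string {
  let copy = generate(prompt);
  for (let i = 0; i < maxRounds; i++) {
    const verdict = judge(copy);
    if (verdict.accepted) return copy;
    copy = generate(prompt, verdict.feedback); // loop the judge's notes back in
  }
  return copy; // give up after maxRounds to avoid an infinite loop
}
```

The `maxRounds` cap matters in practice: without it, a judge that never accepts would loop (and bill) forever.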

7. Memory Agent

A very common pattern where data (e.g., PDF files) is uploaded to memory, then parsed, chunked, and embedded. An AI agent can then query this memory to answer questions based on the stored data [00:00:12], [00:06:14], [00:21:50].
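A rough sketch of the store-and-query side of this pattern follows. Word-overlap scoring stands in for vector embeddings, and the class is illustrative rather than a specific memory API:

```typescript
// Memory-agent sketch: chunk documents into a store, then retrieve the
// best-matching chunks for the LLM to answer from. Word overlap replaces
// real embedding similarity for this illustration.
class Memory {
  private chunks: string[] = [];

  add(doc: string, size = 100): void {
    for (let i = 0; i < doc.length; i += size) {
      this.chunks.push(doc.slice(i, i + size));
    }
  }

  query(question: string, topK = 2): string[] {
    const words = question.toLowerCase().split(/\W+/).filter(Boolean);
    return this.chunks
      .map(c => ({ c, score: words.filter(w => c.toLowerCase().includes(w)).length }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK)
      .map(x => x.c);
  }
}
```

The retrieved chunks would then be placed into the agent's prompt, which is exactly the "chat with PDF" flow demonstrated below.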

Practical Demonstrations of Primitive-Based Agents

Several real-world examples illustrate the power of building with primitives:

  • Chat with PDF: A basic AI agent built with memory (vector store), parser, and chunker primitives to allow users to chat with PDF content [00:00:26], [00:06:16].
  • Deep Researcher Agent (Perplexity-like): Analyzes a query, performs web searches, consolidates results, and generates a comprehensive response using tools like Exa [00:22:37].
  • Receipt Checker (OCR Agent): Utilizes an OCR primitive (e.g., from Mistral) and an LLM (e.g., GPT-4 Vision) to process images of receipts, extract information like total paid and city, and analyze results [00:23:16].
  • Chat with Image Agent: Uses a vision-capable LLM (e.g., GPT-4 Vision) to analyze an image provided via URL and answer questions about its content, such as a person’s expression [00:24:18].

These examples demonstrate that 80% of even the most complicated AI agents can be built using these AI primitive patterns [00:22:24], [00:22:29]. The core idea is to create simple, reusable code blocks that avoid complex abstraction layers, which could become a migration challenge as LLMs and agentic workflows continue to evolve [00:19:32].