From: aidotengineer

AI agent development is undergoing a significant shift, moving away from traditional AI application frameworks and architecture towards the use of foundational AI primitives [00:01:31]. This approach is advocated by Ahmed, who has extensive experience in software development, having contributed to major platforms like WordPress, Next.js, Node.js, and React, and even NASA helicopter missions [00:01:37]. His journey in LLMs began in 2020 with early access to GPT-3, leading to work on projects like GitHub Copilot [00:02:54].

The Case Against AI Frameworks

Many production-ready AI agents, including those from Perplexity and Cursor, are not built on top of AI application frameworks and architecture [00:01:13]. The core objection is that frameworks lock builders into abstractions:

The rapid pace of AI development, with new paradigms, LLMs, and solutions emerging weekly, makes committing to a fixed framework risky [00:25:51]. Building with frameworks can leave teams “stuck” with an abstraction that may not adapt to future advances in agentic workflows [00:19:44].

The Power of AI Primitives

AI primitives are small, composable building blocks that work exceptionally well in production environments [00:02:29]. They offer:

  • Native scalability: Similar to Amazon S3, primitives are low-level and inherently scalable [00:02:31].
  • Simplicity: They are straightforward, with “no magic, nothing to learn” [00:15:42].
  • Composability: Allowing engineers to combine them to build complex AI agents [00:03:56].
  • Production readiness: When primitives come with built-in cloud capabilities (like memory with auto-scaling vector stores), they enable serverless AI agents that handle heavy lifting automatically [00:05:17].

::startcallout

All engineers are becoming AI engineers.

Ahmed believes that most engineers, including full-stack, web, front-end, DevOps, and ML engineers, are transitioning into AI engineering roles, actively shipping products with AI [00:04:17]. ::endcallout

Key AI Primitives

Essential AI primitives for building agents include [00:10:23]:

  • Memory: An autonomous RAG (retrieval-augmented generation) engine that can store and retrieve terabytes of data, often with an integrated vector store for similarity search [00:05:24].
  • Threads: For storing and managing context and conversation history within an agent [00:04:01].
  • Parser: To extract context from various file formats (e.g., PDF to text) [00:10:41].
  • Chunker: To split extracted context into smaller, manageable pieces for similarity search [00:10:42].
  • Tools Infrastructure: Enables agents to automatically call external APIs or interact with other systems [00:11:33].
  • Workflow Engine: Purpose-built for multi-step agent processes [00:10:30].
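To make the parser-then-chunker step concrete, a chunker can be as simple as a function that splits parsed text into fixed-size, overlapping windows ready for embedding. This is an illustrative sketch, not Langbase's actual API; the names and sizes are assumptions:

```typescript
// Minimal fixed-size chunker with overlap: splits parsed text into
// windows suitable for embedding and similarity search.
function chunkText(text: string, chunkSize = 200, overlap = 50): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap; // overlap preserves context across boundaries
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached
  }
  return chunks;
}
```

Production chunkers are usually smarter (sentence- or token-aware), but the shape of the primitive is the same: text in, searchable pieces out.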

Langbase provides these primitives, aiming to be the fastest way to build production-ready AI agents [00:04:47].

Common AI Agent Architectures Using Primitives

Ahmed demonstrates several common AI agent architectures built purely with AI primitives [00:06:06]:

1. Augmented LLM

This is a fundamental agent that takes input and generates output using an LLM. It integrates tools, threads (for short-term context/scratchpad), and memory (for long-term storage and retrieval of large datasets) [00:11:18]. This architecture can be used to build almost any type of AI agent [00:12:47].
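The augmented-LLM loop can be sketched in plain TypeScript with the LLM, tool, and memory calls stubbed out. Every name below is illustrative, not Langbase's API; real agents would replace the stubs with actual model and vector-store calls:

```typescript
type Message = { role: "user" | "assistant" | "tool"; content: string };

// Stubbed dependencies: in a real agent these are an LLM API,
// a tool-calling layer, and a vector-store-backed memory primitive.
const memory = { search: (q: string) => [`stored fact relevant to "${q}"`] };
const tools = { weather: (city: string) => `22°C and sunny in ${city}` };
const llm = (prompt: string): string => {
  if (prompt.includes("Tool result")) return prompt.split("Tool result: ")[1];
  if (prompt.includes("weather")) return "TOOL:weather:Paris";
  return `Answer using context: ${prompt}`;
};

// Augmented LLM: thread (short-term context) + memory (long-term) + tools.
function runAgent(input: string, thread: Message[] = []): string {
  thread.push({ role: "user", content: input });
  const context = memory.search(input).join("\n"); // long-term retrieval
  let reply = llm(`${context}\n${input}`);
  if (reply.startsWith("TOOL:")) {                 // model requested a tool call
    const [, name, arg] = reply.split(":");
    const result = tools[name as keyof typeof tools](arg);
    thread.push({ role: "tool", content: result });
    reply = llm(`${input}\nTool result: ${result}`);
  }
  thread.push({ role: "assistant", content: reply });
  return reply;
}
```

The thread accumulates the conversation as a scratchpad, while memory supplies retrieved context on every turn.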

2. Prompt Chaining and Composition

This architecture involves multiple agents working together in a sequence. An input is processed by one agent, and its output determines the next step, potentially involving another agent. Examples include a summary agent feeding into a features agent, then a marketing copy agent [00:13:06]. This is implemented using plain JavaScript/TypeScript code [00:13:41].
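In plain TypeScript, the chain is just function composition; each "agent" below is a stub standing in for an LLM call, so the names and outputs are illustrative:

```typescript
// Each "agent" wraps an LLM call; stubbed here with string transforms
// so the sketch stays self-contained.
const summaryAgent = (input: string) => `Summary: ${input.slice(0, 40)}`;
const featuresAgent = (summary: string) => `Features derived from (${summary})`;
const copyAgent = (features: string) => `Marketing copy using ${features}`;

// Prompt chaining: each step's output feeds the next, in plain code.
function chain(input: string): string {
  const summary = summaryAgent(input);
  const features = featuresAgent(summary);
  return copyAgent(features);
}
```

Because the chain is ordinary code, adding a conditional branch or an extra step needs no framework support.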

3. Agent Router (LLM Router)

Here, a routing agent (or LLM router) decides which specialized agent should be called next based on the input query [00:14:00].

  • Example: Three specialized agents (summary, reasoning, coding) built with different LLM models (Gemini, Llama 70B, Claude Sonnet) are available. A routing agent is configured to select the appropriate one based on the user’s task [00:14:24]. The router’s job is to respond with valid JSON indicating the next agent [00:15:06].
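A sketch of the router pattern, with the router LLM stubbed by a keyword check that returns the same valid-JSON shape a real routing prompt would produce (agent names and routing logic are assumptions for illustration):

```typescript
// Specialized agents, stubbed; in production each would call a different
// model (e.g. Gemini for summaries, Claude Sonnet for code).
const agents: Record<string, (task: string) => string> = {
  summary: (t) => `summary of: ${t}`,
  reasoning: (t) => `reasoning about: ${t}`,
  coding: (t) => `code for: ${t}`,
};

// The router LLM is prompted to reply with valid JSON naming the next
// agent. A keyword stub stands in for that call here.
function routerLLM(task: string): string {
  const agent = /code|function|bug/.test(task) ? "coding"
    : /why|explain/.test(task) ? "reasoning"
    : "summary";
  return JSON.stringify({ agent });
}

function route(task: string): string {
  const { agent } = JSON.parse(routerLLM(task)); // parse the router's JSON verdict
  return agents[agent](task);
}
```

Constraining the router to a JSON schema is what makes the handoff reliable: the caller parses one field and dispatches.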

4. Running Agents in Parallel

For tasks that can be broken down into independent sub-tasks, multiple agents can be run concurrently. JavaScript’s Promise.all function can easily orchestrate this without complex abstractions [00:16:47]. An example includes running sentiment analysis, summarization, and a decision-maker agent in parallel [00:17:00].
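The parallel pattern really is just `Promise.all` over independent calls; the agents below are async stubs standing in for separate LLM requests:

```typescript
// Stubbed async agents; each would be an independent LLM call.
const sentimentAgent = async (_text: string) => "positive";
const summarizerAgent = async (text: string) => `summary of: ${text}`;
const decisionAgent = async (_text: string) => "approve";

// Independent sub-tasks run concurrently with plain Promise.all,
// with no framework abstraction needed.
async function analyze(text: string) {
  const [sentiment, summary, decision] = await Promise.all([
    sentimentAgent(text),
    summarizerAgent(text),
    decisionAgent(text),
  ]);
  return { sentiment, summary, decision };
}
```

Since the three calls share no state, total latency is roughly that of the slowest call rather than the sum of all three.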

5. Agent Orchestrator Worker

This pattern, resembling deep research agent architectures, involves an orchestrator agent that plans and creates subtasks. These subtasks are then completed by multiple worker agents in parallel, and their results are synthesized by another agent [00:17:10].

  • Example: Writing a blog post on remote work benefits (productivity, work-life balance, environmental impact) [00:18:10].
    • The orchestrator breaks this into subtasks: “write introduction,” “write section on productivity,” etc. [00:18:21].
    • Multiple worker agents execute these subtasks concurrently [00:18:49].
    • Finally, a synthesizing agent combines the results into a cohesive blog post [00:19:02].
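The plan-execute-synthesize flow above can be sketched as three plain functions, with all LLM calls stubbed (subtask wording and names are illustrative):

```typescript
// Orchestrator plans subtasks, workers run them in parallel, and a
// synthesizer merges the results. All LLM calls are stubbed here.
const orchestrator = (topic: string): string[] => [
  `write introduction for ${topic}`,
  `write section on productivity`,
  `write section on work-life balance`,
];
const worker = async (subtask: string) => `[draft: ${subtask}]`;
const synthesizer = (drafts: string[]) => drafts.join("\n\n");

async function writePost(topic: string): Promise<string> {
  const subtasks = orchestrator(topic);                   // plan
  const drafts = await Promise.all(subtasks.map(worker)); // parallel workers
  return synthesizer(drafts);                             // combine into one post
}
```

The synthesis step is what distinguishes this from the plain parallel pattern: worker outputs are merged into a single coherent artifact.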

6. Evaluator Optimizer

An agent generates a response (e.g., marketing copy), which is then evaluated by another LLM acting as a “judge.” This judge either accepts the response or rejects it with specific feedback. The feedback is then used to refine the generation in subsequent iterations [00:20:08]. This iterative process allows for continuous improvement of generated content [00:21:32].
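A minimal sketch of the loop, with the generator and judge stubbed (here the judge rejects copy lacking a call to action, which is an invented criterion for illustration):

```typescript
// Generator produces copy; a judge LLM accepts or rejects with feedback.
// Both are stubbed: this judge rejects anything without a call to action.
const generate = (prompt: string, feedback?: string) =>
  feedback ? `${prompt} Buy now!` : prompt;
const judge = (copy: string): { ok: boolean; feedback?: string } =>
  copy.includes("Buy now")
    ? { ok: true }
    : { ok: false, feedback: "add a call to action" };

// Evaluator-optimizer loop: the judge's feedback refines each attempt.
function refine(prompt: string, maxIters = 3): string {
  let feedback: string | undefined;
  for (let i = 0; i < maxIters; i++) {
    const copy = generate(prompt, feedback);
    const verdict = judge(copy);
    if (verdict.ok) return copy;
    feedback = verdict.feedback; // feed the critique into the next attempt
  }
  throw new Error("no acceptable copy within iteration budget");
}
```

Capping the iterations matters in practice: without a budget, a strict judge can loop indefinitely and burn tokens.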

7. Memory-Based Agents

A common pattern involves creating a memory for storing data (e.g., PDFs, web pages). This data is parsed, chunked, and embedded using primitives, allowing an AI agent to perform similarity searches and answer questions based on the stored information [00:06:16].

  • Example 1: Chat with PDF: Uploading PDF documents to a memory, then asking questions that retrieve and synthesize answers from the stored content [00:06:51].
  • Example 2: Receipt Checker: Using OCR (Mistral) to process an image of a receipt, extracting relevant information like total paid and city [00:23:13].
  • Example 3: Chat with Image: Analyzing an image URL using a vision-capable LLM (GPT-4o) to describe content, expressions, etc. [00:24:18].
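The retrieval step behind all three examples can be sketched with a toy embedding and cosine similarity. Real memories use learned embeddings and an auto-scaling vector store; the hash-based embedding here is only a stand-in to keep the sketch runnable:

```typescript
// Toy bag-of-words embedding: hashes each word into one of 8 buckets.
function embed(text: string): number[] {
  const v = new Array(8).fill(0);
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 0;
    for (const ch of word) h = (h * 31 + ch.charCodeAt(0)) % 8;
    v[h] += 1;
  }
  return v;
}

// Cosine similarity between two vectors, guarding against zero norms.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b) || 1);
}

// Retrieve the stored chunk most similar to the question.
function retrieve(question: string, chunks: string[]): string {
  const q = embed(question);
  return chunks
    .map((c) => ({ c, score: cosine(q, embed(c)) }))
    .sort((a, b) => b.score - a.score)[0].c;
}
```

The memory primitive packages exactly this parse, chunk, embed, and search pipeline, so the agent only has to ask questions against it.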

These AI-native development patterns, built on primitives, can cover approximately 80% of complex AI agent use cases [00:22:24].

::startcallout

Chai for Vibe Coding AI Agents

Chai is a platform that allows users to “vibe code” AI agents by building them on top of AI primitives instead of frameworks [00:00:31]. It simplifies the process of building, deploying, and using agents [00:26:36]. ::endcallout