From: aidotengineer
This article explores the evolution of AI, drawing on personal experiences from the mid-2010s to present-day advancements in AI agents and development methodologies at Sierra, a conversational AI platform for businesses [00:00:27]. Sierra, initially known for chat and customer service, is expanding into phone interactions and broader customer experience touchpoints, including sales, subscription management, and product recommendations [00:00:47].
The Early Days: AI “Caves” (2010s)
The speaker reflects on the “ancient history” of AI, noting that discussions of AI’s past often go back only to 2019 or 2020 [00:01:40]. However, he emphasizes that 2016 already felt like the “AI caves,” even though AI’s origins date to the 1970s [00:01:57].
Computer Vision Challenges (2016)
In 2016, working at Google with computer vision engineers, the primary challenge was helping computers differentiate between seemingly similar objects, such as a Chihuahua and a blueberry muffin, dogs and bagels, or dogs and fried chicken [00:02:13]. This work was the initial phase of building Google Lens [00:02:37].
At this time, one of the few consumer applications where computer vision models showed promise was identifying plants [00:03:08]. Testing these early models felt like a “slot machine” due to the non-determinism of inputs or outputs, a common experience for those building with AI [00:03:48].
Google Brain and Early Large Models (2012)
Rewinding further to 2012, an even earlier “AI caves” period, the speaker points to a significant breakthrough: a Google Brain model that learned to identify cats by watching YouTube videos [00:05:21]. With approximately a billion parameters, this model was considered enormous at the time, although it is small next to today’s frontier models, which have about a trillion parameters [00:05:30].
Modern AI and the Agent Development Life Cycle
Today, Google Lens has vastly improved, allowing users to search, shop, translate text in non-Latin scripts, and even solve math homework using visual input [00:04:07]. This progress is attributed to consistent, step-by-step iteration over a decade [00:04:43].
The Software Development Life Cycle (SDLC) Parallel
The continuous improvement seen in AI engineering mirrors the established Software Development Life Cycle (SDLC), which cycles iteratively through phases such as analysis, design, implementation, testing, and maintenance [00:04:57].
The Agent Development Life Cycle (ADLC) at Sierra
Sierra has adapted these principles to create the “Agent Development Life Cycle” (ADLC) for building and improving AI agents [00:12:12]. This is critical because building with large language models (LLMs) is like building on a “foundation of jello” [00:11:32].
Key distinctions between traditional software and LLMs:
- Traditional Software: Deterministic, fast, cheap, rigid, governed by if statements [00:11:43].
- LLMs: Non-deterministic, slow, expensive to run, flexible, creative, capable of reasoning [00:11:51].
The ADLC aims to leverage LLM strengths while integrating traditional software where beneficial [00:12:02].
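The split between deterministic software and flexible model output can be sketched as follows. This is a minimal illustration, not Sierra's implementation: the order-ID format, `fake_llm_reply`, and `handle_refund_request` are hypothetical names, and the model call is replaced by a stub that picks a phrasing at random to mimic non-determinism.

```python
import random
import re

# Hypothetical order-ID format, assumed only for this illustration.
ORDER_ID = re.compile(r"^[A-Z]{2}\d{6}$")


def fake_llm_reply(prompt: str, rng: random.Random) -> str:
    """Stand-in for a real model call: the wording varies between calls."""
    templates = [
        "Happy to help! {p}",
        "Sure thing. {p}",
        "Of course: {p}",
    ]
    return rng.choice(templates).format(p=prompt)


def handle_refund_request(order_id: str, rng: random.Random) -> str:
    # Deterministic guard: plain if-statement logic, fast, cheap, and rigid.
    if not ORDER_ID.match(order_id):
        return "That order ID does not look valid. Could you check it?"
    # Flexible step: the phrasing is delegated to the (stubbed) model.
    return fake_llm_reply(f"I have started a refund for order {order_id}.", rng)


if __name__ == "__main__":
    rng = random.Random()
    print(handle_refund_request("not-an-id", rng))
    print(handle_refund_request("AB123456", rng))
```

The validation path always returns the same answer, while the reply wording changes from run to run. Keeping business rules in the deterministic layer means only the tone, not the decision, depends on the model.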
Practical Application of ADLC: Chubbies
Sierra partnered with Chubbies to create an AI agent named Duncan Smothers [00:07:37]. Duncan is highly capable and personable, assisting customers with sizing and fit, inventory tracking, and package tracking and refunds [00:08:11]. This demonstrates “autonomous agents” that take action, not just answer questions, leading to more customers being helped quickly and with higher satisfaction [00:08:49].
Sierra treats “every agent as a product,” requiring a fully featured developer and customer experience operations platform [00:09:04]. They have dedicated agent engineering and product management teams working directly with customers [00:09:32].
Continuous Improvement and QA in ADLC
The ADLC emphasizes continuous improvement. If Duncan Smothers makes a mistake, such as reporting incorrect inventory, an issue is filed and a test is created; once the test passes, a new release is made [00:13:05]. Over time, a Sierra agent can go from a handful of tests at launch to hundreds or thousands [00:13:27]. Furthermore, agents can be designed to go “above and beyond,” such as having a budget to delight customers with unexpected services like DoorDashing shorts from a retail location [00:13:37].
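The issue, test, release loop can be pictured as a regression suite that grows by one case per reported mistake. The sketch below is an assumption-laden illustration rather than Sierra's tooling: the SKUs, `agent_check_stock`, `RegressionCase`, and `can_release` are all hypothetical, and the agent is reduced to a plain lookup function.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical inventory data, invented for this example.
INVENTORY = {"classic-shorts-m": 0, "classic-shorts-l": 12}


def agent_check_stock(sku: str) -> str:
    """Hypothetical agent tool that reports availability for a SKU."""
    count = INVENTORY.get(sku)
    if count is None:
        return "unknown item"
    return "in stock" if count > 0 else "out of stock"


@dataclass
class RegressionCase:
    issue: str     # short description of the filed issue
    sku: str       # input that triggered the mistake
    expected: str  # answer the agent should give


# Each filed issue adds one case, so the suite grows over time.
REGRESSION_SUITE = [
    RegressionCase("agent said sold-out item was available", "classic-shorts-m", "out of stock"),
    RegressionCase("agent could not find a stocked item", "classic-shorts-l", "in stock"),
]


def run_suite(agent: Callable[[str], str], suite: List[RegressionCase]) -> List[str]:
    """Return the issues whose cases still fail; an empty list means all pass."""
    return [case.issue for case in suite if agent(case.sku) != case.expected]


def can_release(agent: Callable[[str], str], suite: List[RegressionCase]) -> bool:
    # A release is cut only when every recorded case passes.
    return not run_suite(agent, suite)


if __name__ == "__main__":
    print("release ok:", can_release(agent_check_stock, REGRESSION_SUITE))
```

Because cases are never removed, a fix for one issue cannot silently reintroduce an older one without blocking the release.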
A year prior, this entire process was manual [00:14:00]. Now, AI is being integrated into each part of the ADLC to speed up improvements [00:14:13]. The ADLC becomes even more effective for larger customers, where velocity and change management are crucial [00:14:26]. Changes also stem from external factors like model upgrades, new paradigms (e.g., reasoning models), and multimodality [00:14:52]. Reasoning models act as a “force multiplier,” making the application of AI more effective across development, testing, and QA [00:15:05].
Evolution of AI Interfaces: Voice Agents
Sierra launched voice capabilities in October, benefiting large customers like SiriusXM, who can now answer customer calls immediately [00:15:29]. Sierra’s approach to voice is akin to responsive web design: it’s the same underlying platform and agent code, but it adapts to the channel and modality of interaction [00:16:13]. Customization for phrasing and parallelizing requests for lower latency are still possible [00:16:27].
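One way to picture both ideas, shared agent logic with channel-specific rendering and parallel requests for lower latency, is the sketch below. It is only an illustration under stated assumptions: `fetch_account`, `fetch_open_orders`, `render`, and `answer` are hypothetical, and network calls are simulated with `asyncio.sleep`.

```python
import asyncio


async def fetch_account(user_id: str) -> dict:
    """Hypothetical backend lookup; the sleep simulates network latency."""
    await asyncio.sleep(0.05)
    return {"user_id": user_id, "plan": "monthly"}


async def fetch_open_orders(user_id: str) -> list:
    """Hypothetical backend lookup; the sleep simulates network latency."""
    await asyncio.sleep(0.05)
    return ["order-1"]


def render(message: str, channel: str) -> str:
    """Adapt the same message to the channel, like responsive web design."""
    if channel == "voice":
        # Spoken replies carry no visual markup, so strip the emphasis markers.
        return message.replace("**", "")
    return message


async def answer(user_id: str, channel: str) -> str:
    # Run both lookups concurrently so total wait is roughly the slowest call,
    # not the sum of the two.
    account, orders = await asyncio.gather(
        fetch_account(user_id),
        fetch_open_orders(user_id),
    )
    message = f"You are on the **{account['plan']}** plan with {len(orders)} open order(s)."
    return render(message, channel)


if __name__ == "__main__":
    print(asyncio.run(answer("u1", "chat")))
    print(asyncio.run(answer("u1", "voice")))
```

The agent logic in `answer` is identical for both channels; only `render` differs, which keeps channel-specific behavior in one small place.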
The Nature of Large Language Models (LLMs)
Building with LLMs is fascinating because they remind us of ourselves: they are unpredictable, slow, and not great at math [00:16:46]. However, this unpredictability allows for a unique form of empathy in design, enabling creators to put themselves in the “shoes of the robot” and build better experiences [00:17:03].
When building voice agents, a key question is how robust they can be if given only transcribed text that arrives with a delay [00:17:40]. Sierra aims to build robust and rich experiences by giving LLMs the same inputs and experiences that humans have [00:17:51].