From: aidotengineer

Artificial intelligence has advanced so quickly that reflections on its history often reach back only to 2019 or 2020 [00:01:33]. Even those years count as “ancient history” in the fast-moving AI landscape [00:01:45].

Early AI “Caves”

While AI’s roots stretch back to the 1970s [00:01:57], the years around 2012 and 2016 felt like the “AI caves” because the technology was still nascent [00:01:51], [00:02:01], [00:05:12].

2012: Google Brain and Model Scale

Around 2012, a significant breakthrough occurred when Google Brain successfully identified cat videos on YouTube [00:05:24]. This model, with approximately one billion parameters, was considered a huge advancement at the time [00:05:32]. For context, modern frontier models can have about a trillion parameters, which makes the 2012 model roughly 1,000 times smaller [00:05:36]. During this era, a prevailing view also held that there were limits to what computers could achieve, a view that is less popular today [00:05:49].

2016: The Infancy of Google Lens

In 2016, efforts in computer vision focused on helping computers differentiate between visually similar objects, such as Chihuahuas and blueberry muffins, dogs and bagels, or dogs and mops [00:02:15]. This work laid the foundation for the first version of Google Lens [00:02:37]. In the early stages of Google Lens, identifying plants was one of the few consumer applications where computer vision models excelled [00:03:05]. Even so, early AI models were unpredictable: using them felt like pulling a “slot machine,” with inconsistent results that reflected the non-determinism of model inputs and outputs [00:03:48].

Present Day Advancements (2024)

Today, Google Lens has evolved significantly beyond its initial capabilities [00:04:07]. Users can now search and shop based on what they see, use it on Google Images and YouTube, translate text in non-Latin scripts, and even solve math homework [00:04:11]. It also retains its original ability to identify flowers [00:04:36]. This progress is attributed to consistent, step-by-step iteration over a decade [00:04:43].

Software Development Life Cycle (SDLC) vs. Agent Development Life Cycle (ADLC)

The iterative improvement of AI, similar to general software, relies on a structured process [00:04:52]. The traditional Software Development Life Cycle (SDLC) involves continuous improvement through implementation, testing, maintenance, analysis, and design [00:05:00].

However, developing with large language models (LLMs) presents unique challenges, and the experience is described as “building on top of a foundation of jello” [00:11:34].

Traditional Software vs. Large Language Models:

| Feature | Traditional Software | Large Language Models |
| --- | --- | --- |
| Determinism | Deterministic [00:11:43] | Non-deterministic [00:11:51] |
| Speed | Fast [00:11:43] | Slow [00:11:53] |
| Cost | Cheap [00:11:45] | Expensive to run [00:11:53] |
| Flexibility | Rigid (governed by if statements) [00:11:45] | Very flexible [00:11:55] |
| Capabilities | Follows logic [00:11:48] | Creative, can reason through problems [00:11:57] |

To address these differences, Sierra developed the Agent Development Life Cycle (ADLC) [00:12:12]. This methodology leverages the strengths of LLMs while integrating traditional software where beneficial [00:12:02]. The ADLC applies AI to each part of the life cycle, speeding up improvements [00:14:13]. Reasoning models, for example, act as a force multiplier, making AI more effective in development, testing, and QA [00:15:06].
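One way to picture combining LLM strengths with traditional software is to let a model interpret free-form language and leave rule-based decisions to deterministic code. The following is a minimal sketch under stated assumptions, not Sierra's implementation: `classify_intent` is a hypothetical stand-in for an LLM call (a keyword check keeps it runnable offline), and the 30-day refund window is an invented business rule used only for illustration.

```python
from datetime import date


def classify_intent(message: str) -> str:
    # Hypothetical stand-in for an LLM call. A real system would send the
    # message to a model; a keyword check keeps this sketch self-contained.
    text = message.lower()
    if "refund" in text or "money back" in text:
        return "refund_request"
    return "other"


def refund_eligible(purchased: date, today: date, window_days: int = 30) -> bool:
    # Deterministic rule (the 30-day window is an assumption): the same
    # inputs always produce the same answer, quickly and cheaply.
    return 0 <= (today - purchased).days <= window_days


def handle(message: str, purchased: date, today: date) -> str:
    # Flexible language understanding first, rigid business logic second.
    if classify_intent(message) != "refund_request":
        return "handoff"
    if refund_eligible(purchased, today):
        return "refund_approved"
    return "refund_denied"
```

In this split, the non-deterministic component only decides what the customer is asking for, while the eligibility decision stays in ordinary code that can be unit tested.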

The Rise of AI Agents

As businesses recognize the necessity of AI agents to represent their operations and assist customers, companies like Chubbies have partnered with Sierra to deploy conversational AI platforms [00:07:02]. These agents, like “Duncan Smothers,” are designed to be highly capable, handling tasks such as sizing inquiries, inventory tracking, package tracking, and refunds [00:07:42]. The goal is for these agents to perform autonomous actions, not just answer questions, leading to higher customer satisfaction [00:08:49].

Every AI agent is treated as a product, requiring a fully featured developer and customer experience operations platform for optimal results [00:09:06]. Continuous iteration is key; even if an agent isn’t perfect initially, it should consistently improve [00:11:19]. This involves a rigorous quality assurance process where customer feedback leads to issue filing, test creation, and new releases, resulting in agents having thousands of tests over time [00:12:45].
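The feedback-to-release loop described above can be sketched as a regression suite that grows by one case per reported issue. This is an illustrative sketch only: `RegressionCase`, `RegressionSuite`, `agent_v1`, and `agent_v2` are hypothetical names, and the agents are plain functions rather than real LLM-backed agents.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Agent = Callable[[str], str]  # hypothetical: maps a user message to a reply


@dataclass
class RegressionCase:
    issue: str                    # the problem reported in customer feedback
    user_message: str             # the input that triggered it
    check: Callable[[str], bool]  # True if the agent's reply is acceptable


@dataclass
class RegressionSuite:
    cases: List[RegressionCase] = field(default_factory=list)

    def add_from_feedback(self, issue: str, user_message: str,
                          check: Callable[[str], bool]) -> None:
        # Each filed issue becomes a permanent test case.
        self.cases.append(RegressionCase(issue, user_message, check))

    def failures(self, agent: Agent) -> List[str]:
        # Return the issues whose checks fail for this agent version.
        return [c.issue for c in self.cases
                if not c.check(agent(c.user_message))]

    def can_release(self, agent: Agent) -> bool:
        # A new release is allowed only when every case passes.
        return not self.failures(agent)


def agent_v1(message: str) -> str:
    # Earlier version: never handles refunds.
    return "Please contact support."


def agent_v2(message: str) -> str:
    # Later version: fixed after the issue was filed.
    if "refund" in message.lower():
        return "I have started your refund."
    return "Please contact support."


suite = RegressionSuite()
suite.add_from_feedback(
    issue="Agent did not start a refund when asked",
    user_message="I'd like a refund for my order",
    check=lambda reply: "refund" in reply.lower(),
)
```

Calling `suite.can_release(agent_v1)` fails on the filed issue, while `agent_v2` passes. Because real LLM replies vary between runs, a production check would likely need to tolerate phrasing differences or sample several replies rather than match a single string.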

Voice Agents and Multimodality

The same platform also supports voice agents, which launched in October 2023 [00:15:31]. The approach to voice agents is akin to responsive web design, where a single underlying platform and agent code can adapt to different channels and modalities [00:16:13]. This allows phrasing to be customized per channel and requests to be parallelized for lower latency [00:16:27].
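The two ideas in this paragraph, one agent adapting to several channels and parallel requests to cut latency, can be sketched together. This is a minimal sketch under stated assumptions: the `Reply` type, the two `fetch_*` functions, and the channel names are hypothetical, and `asyncio.sleep` stands in for slow backend calls.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Reply:
    """Channel-independent content produced by the shared agent logic."""
    order_status: str
    eta_days: int


async def fetch_order_status(order_id: str) -> str:
    # Stand-in for a slow backend call (hypothetical).
    await asyncio.sleep(0.05)
    return "shipped"


async def fetch_eta_days(order_id: str) -> int:
    # Stand-in for a second, independent backend call (hypothetical).
    await asyncio.sleep(0.05)
    return 3


async def build_reply(order_id: str) -> Reply:
    # Run both lookups concurrently so the wait is roughly the slower of
    # the two calls rather than their sum.
    status, eta = await asyncio.gather(
        fetch_order_status(order_id),
        fetch_eta_days(order_id),
    )
    return Reply(order_status=status, eta_days=eta)


def render(reply: Reply, channel: str) -> str:
    # Same content, channel-specific phrasing.
    if channel == "voice":
        return (f"Your order has {reply.order_status}. "
                f"It should arrive in about {reply.eta_days} days.")
    if channel == "chat":
        return f"Status: {reply.order_status} | ETA: {reply.eta_days} days"
    raise ValueError(f"unknown channel: {channel}")
```

For example, `render(asyncio.run(build_reply("A1")), "voice")` produces a full spoken sentence, while the `"chat"` channel gets a compact status line from the same `Reply`.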

One fascinating aspect of building with AI is that LLMs, despite being unpredictable, slow, and not great at math, allow designers to have empathy in a new way [00:16:46]. That empathy helps creators understand how to build robust and rich experiences for AI agents, even in challenging scenarios like real-time voice interactions with delays [00:17:40].