From: aidotengineer

Diamond, who has worked in AI for about 15 years, discusses recent advances in AI and their implications, focusing on DataDog’s development of AI agents [00:00:36]. His career spans Microsoft Cortana, Amazon Alexa, Meta (PyTorch), and his own AI startup focused on a DevOps assistant [00:00:54].

DataDog and AI Agents

DataDog, known as an observability and security platform for cloud applications, has been incorporating AI since around 2015 [00:01:22]. While not always overtly branded as AI, features like proactive alerting, root cause analysis, impact analysis, and change tracking have utilized AI [00:01:56].

The current landscape represents a clear “era shift” in AI, comparable to the advent of microprocessors or the move to Software-as-a-Service (SaaS) [00:02:09]. The shift is characterized by bigger, smarter models, enhanced reasoning capabilities, multimodal AI, and “foundation model wars” in which intelligence becomes “too cheap to meter” [00:02:14]. As a result, people expect more and more from AI [00:02:35].

DataDog is leveraging these advances in model capability and performance to move up the stack: the goal is for customers to use DataDog not just as a DevOps platform, but through AI agents that operate the platform on their behalf [00:02:53]. This involves developing agents, conducting evaluations, and building new types of observability [00:03:06].

Current AI Agents in Private Beta

DataDog is developing several AI agents in private beta [00:03:22]:

  • AI Software Engineer: This agent analyzes problems and errors, then recommends and generates code to improve the system [00:03:25]. It can catch issues like recursion bugs, propose fixes, create tests, and generate pull requests or open diffs in VS Code [00:07:22]. This significantly reduces the time engineers spend manually writing and testing fixes [00:07:38].
  • AI On-Call Engineer: Designed to automate tasks for on-call engineers, this agent activates upon an alert, gathers context from runbooks, and investigates issues by analyzing logs, metrics, and traces [00:03:34]. It can automatically run investigations, provide summaries before a human even reaches their computer, and offer insights into alert causes or trace errors [00:04:26]. This agent can also suggest remediations, such as paging other teams or scaling infrastructure [00:05:51]. After an incident, it can write post-mortems by reviewing all actions taken by both the agent and humans [00:06:26].

Human-AI Collaboration

DataDog emphasizes human-AI collaboration, providing an interface where users can verify agent actions, learn from their processes, and build trust [00:04:47]. Users can see the reasoning behind an agent’s hypotheses, what it found, and ask follow-up questions [00:05:05]. The agent operates by forming hypotheses, testing them with tools (like querying logs or metrics), and validating or invalidating them [00:05:30]. Existing DataDog workflows can be integrated so agents understand and utilize them for remediation [00:06:11].
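The hypothesize-test-validate loop described above can be sketched as a simple skeleton. This is illustrative only, not DataDog’s implementation; the tool callables and the `Hypothesis` structure are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Hypothesis:
    description: str
    evidence: list = field(default_factory=list)
    status: str = "open"  # "open", "validated", or "invalidated"

def investigate(alert, generate_hypotheses, tools):
    """Illustrative agent loop: form hypotheses about an alert, test each
    one against observability tools (log/metric/trace queries), and keep
    only the hypotheses the data supports."""
    hypotheses = [Hypothesis(h) for h in generate_hypotheses(alert)]
    for hyp in hypotheses:
        for tool in tools:  # e.g. query_logs, query_metrics, query_traces
            result = tool(alert, hyp.description)
            if result is not None:  # tool found supporting evidence
                hyp.evidence.append(result)
        hyp.status = "validated" if hyp.evidence else "invalidated"
    return [h for h in hypotheses if h.status == "validated"]
```

Because each hypothesis carries its own evidence list, the interface can show the user exactly what the agent found and why it kept or discarded each line of investigation.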

Lessons Learned in AI Engineering

Diamond shares key lessons from building these AI agents, relevant to anyone working in AI engineering:

  • Scoping Tasks for Evaluation: It’s crucial to define “jobs to be done” clearly and understand tasks step-by-step from a human perspective [00:08:01]. Building vertical, task-specific agents is preferred over generalized ones [00:08:48]. Measurability and verifiability at each step are essential, as demos can be misleading without proper evaluation [00:08:52]. Domain experts should act as design partners or task verifiers, not as code or rule writers, due to the stochastic nature of AI models [00:09:10].
  • Evaluation (Eval): Deeply thinking about evaluation is paramount [00:09:31]. In the “fuzzy stochastic world” of AI, good evaluation is necessary, even starting small with offline, online, and living evaluations [00:09:48]. End-to-end task measurements and appropriate instrumentation to gauge human usage and feedback are vital for a “living breathing test set” [00:09:57].
  • Building the Right Team: While ML experts are valuable, the team should be seeded with one or two, augmented by optimistic generalists who are proficient at coding and willing to iterate quickly [00:10:15]. UX and front-end development are surprisingly critical for effective human-agent collaboration [00:10:28]. Teammates should be excited by the prospect of being AI-augmented themselves and eager to learn in a rapidly changing field [00:10:38].
  • Evolving UX: Traditional UX patterns are changing, and being comfortable with this shift is important [00:11:03]. The focus is on agents that behave more like human teammates rather than requiring new pages or buttons [00:11:28].
  • Importance of Observability: For complex AI workflows, observability is crucial, not an afterthought [00:11:36]. Situational awareness is key to debugging problems [00:11:44]. DataDog’s “LLM Observability” view monitors interactions and calls to models, whether hosted, self-run, or accessed via external APIs, in a unified view [00:11:50]. Agent workflows can involve hundreds of complex, multi-step calls, making specialized “agent graph” views essential for human readability and debugging [00:12:26].
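The “living breathing test set” idea can start very small. The sketch below is a generic pattern rather than DataDog tooling: a fixed offline case set scores the agent end-to-end, while online human feedback accumulates in a log that can be reviewed and promoted into new offline cases.

```python
def run_offline_eval(agent, cases):
    """Score an agent end-to-end against fixed (task, check) cases.
    Each check verifies the final task outcome, not intermediate steps."""
    results = {case_id: check(agent(task))
               for case_id, (task, check) in cases.items()}
    passed = sum(results.values())
    return passed / len(results), results

# Online feedback (e.g. thumbs up/down on an investigation summary)
# accumulates here; periodically reviewing it and promoting examples
# into the offline case set keeps the test set "living".
feedback_log = []

def record_feedback(task, output, accepted):
    feedback_log.append({"task": task, "output": output, "accepted": accepted})
```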
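An agent graph of the kind described can be produced by recording every model and tool call as a nested span. A minimal sketch, assuming nothing about DataDog’s internals:

```python
import time
from contextlib import contextmanager

class AgentTracer:
    """Minimal nested-span recorder: each model or tool call becomes a
    node in a tree, so a whole multi-step run can be rendered as a graph."""
    def __init__(self):
        self.root = {"name": "run", "children": []}
        self._stack = [self.root]

    @contextmanager
    def span(self, name):
        node = {"name": name, "children": []}
        self._stack[-1]["children"].append(node)
        self._stack.append(node)
        start = time.monotonic()
        try:
            yield node
        finally:
            node["duration_s"] = time.monotonic() - start
            self._stack.pop()

    def render(self, node=None, depth=0):
        """Return the call tree as indented lines for human inspection."""
        node = node or self.root
        lines = ["  " * depth + node["name"]]
        for child in node["children"]:
            lines.extend(self.render(child, depth + 1))
        return lines
```

Even a run with hundreds of calls stays debuggable when each hypothesis and each tool query appears as its own span in the tree.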

The “Bitter Lesson” of AI

A significant lesson is that general methods leveraging new off-the-shelf models are often the most effective [00:13:15]. While fine-tuning for specific tasks is common, new foundational models can quickly solve many reasoning problems, making it important for systems to easily switch between models and not be tied to one [00:13:29].
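One way to avoid being tied to a single model is a thin registry that treats every backend as an interchangeable completion function. The sketch below is a generic pattern, not DataDog’s architecture; adopting a new off-the-shelf model then means registering one function, with no agent changes.

```python
from typing import Callable, Dict

# Registry of interchangeable completion backends: name -> prompt -> text.
MODELS: Dict[str, Callable[[str], str]] = {}

def register_model(name: str):
    """Decorator that registers a completion backend under a name."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        MODELS[name] = fn
        return fn
    return wrap

def complete(prompt: str, model: str = "default") -> str:
    """Route a prompt to the chosen backend, failing loudly if unknown."""
    if model not in MODELS:
        raise KeyError(f"unknown model {model!r}")
    return MODELS[model](prompt)
```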

Future Implications

Looking ahead, Diamond sees the pace of AI accelerating, with several implications for agent-based products:

  • AI Agents as Users: There’s a strong belief that AI agents will surpass humans as users of SaaS products like DataDog within the next five years [00:14:01]. This means product developers should consider providing context and API information optimized for agents, not just humans [00:14:21].
  • Agents for Hire: DataDog anticipates offering “a team of DevSecOps agents for hire” in the near future, where agents will handle platform integration and on-call responsibilities directly for users [00:14:56].
  • Empowering Small Companies: Small companies will likely be built by individuals leveraging automated developers (like Cursor or Devon) to bring ideas to life [00:15:25]. Agents like DataDog’s will then handle operations and security, enabling an order of magnitude more ideas to reach the market [00:15:32].

Together, these predictions point to a transformative period in which AI not only assists humans but also operates autonomously, changing how software is developed and maintained.