From: aidotengineer
Rapid advancements in AI are ushering in a new era of collaboration between humans and artificial intelligence, particularly in engineering. The shift is compared to the arrival of microprocessors or the transition to Software as a Service (SaaS): intelligence becomes “too cheap to meter,” raising expectations for what AI can do [00:02:06]. At DataDog, this means evolving from a platform built solely for human use to one where AI agents leverage the platform on behalf of users [00:02:58].
AI Agents as Collaborators
DataDog is developing AI agents designed to act as partners or “co-workers” for engineers, similar to how human teammates would operate [00:00:39] [00:11:28]. These agents are built to handle specific, vertical tasks rather than being generalized, making their performance measurable and verifiable at each step [00:08:48] [00:08:52].
The AI On-Call Engineer
One such agent is the AI On-Call Engineer, which aims to reduce the need for human engineers to wake up for 2 AM alerts [00:03:48]. This agent:
- Proactively starts an investigation when an alert occurs [00:04:04].
- Situates itself by reading runbooks and gathering context [00:04:07].
- Performs common troubleshooting steps like looking through logs, metrics, and traces [00:04:16].
- Automatically runs investigations and provides summaries before a human even reaches their computer [00:04:26].
- Operates by forming hypotheses, testing them with tools, and validating or invalidating each one, much like a human Site Reliability Engineer (SRE) or DevOps engineer; a minimal sketch of this loop follows the list [00:05:31].
- Can suggest remediations, such as paging another team or scaling infrastructure [00:05:53].
- Integrates with existing DataDog workflows, allowing the agent to understand and map to remediation processes [00:06:14].
- Can even write postmortems after an incident is remediated, summarizing what occurred and the actions taken by both humans and the agent [00:06:27].
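The hypothesis-driven loop described above can be made concrete with a short sketch. The Hypothesis shape and the idea of a “test” tool call below are hypothetical stand-ins, not Datadog APIs; the sketch only illustrates the form-test-validate cycle.

```python
# Minimal sketch of the hypothesis-driven loop described above. The Hypothesis
# shape and the "test" tool calls are hypothetical, not Datadog APIs.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Hypothesis:
    description: str          # e.g. "latency spike caused by connection pool exhaustion"
    test: Callable[[], bool]  # queries logs/metrics/traces and says whether evidence supports it

def investigate(alert: str, hypotheses: List[Hypothesis]) -> dict:
    """Test each hypothesis in turn and summarize what was validated or invalidated."""
    findings = [
        {"hypothesis": h.description, "supported": h.test()}
        for h in hypotheses
    ]
    confirmed = [f["hypothesis"] for f in findings if f["supported"]]
    return {
        "alert": alert,
        "findings": findings,
        "summary": confirmed[0] if confirmed else "no hypothesis confirmed; escalate to a human",
    }

# Example: in practice the callables would wrap real log/metric/trace queries.
report = investigate(
    "checkout-service high error rate",
    [Hypothesis("bad deploy in the last hour", lambda: True),
     Hypothesis("upstream dependency outage", lambda: False)],
)
```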
The AI Software Engineer
This agent acts as a proactive developer or DevOps/software engineering assistant [00:06:55]. It observes and acts on errors, automatically analyzing them, identifying causes, and proposing solutions [00:07:05]. This can include generating code fixes and creating tests to prevent future occurrences, significantly reducing manual work for human engineers [00:07:12] [00:07:38].
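As a rough illustration of that observe-analyze-propose flow, here is a minimal sketch in which every function is a hypothetical stub; a real agent would call an LLM and a code host rather than these placeholders.

```python
# Rough sketch of the observe -> analyze -> propose flow. Every function here is
# a hypothetical stub; a real agent would call an LLM and a code host instead.
from typing import TypedDict

class ErrorEvent(TypedDict):
    service: str
    message: str
    stack_trace: str

def analyze_error(event: ErrorEvent) -> str:
    # Stand-in for model-driven root-cause analysis over the stack trace and telemetry.
    return f"Probable cause in {event['service']}: {event['message']}"

def propose_fix(root_cause: str) -> dict:
    # Stand-in for generating a code patch plus a regression test to prevent recurrence.
    return {
        "patch": f"# patch addressing: {root_cause}",
        "test": f"# test guarding against: {root_cause}",
    }

def handle_new_error(event: ErrorEvent) -> dict:
    """Observe an error, identify a likely cause, and propose a change for human review."""
    root_cause = analyze_error(event)
    return {"root_cause": root_cause, "proposed_change": propose_fix(root_cause)}
```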
Fostering Human-AI Collaboration
A key aspect of building these agents is enabling effective human-AI collaboration. This involves designing systems where humans can:
- Verify and Oversee: Humans need to be able to verify what agents have done and oversee their actions to learn from them [00:04:56].
- Build Trust: Transparency in the agent’s reasoning helps earn trust. Users can see how hypotheses were generated and what the agent found, allowing them to agree or disagree with decisions; a sketch of such a reviewable trace follows this list [00:05:03].
- Ask Follow-Up Questions: The agent can be treated like a “junior engineer” who has performed work, allowing human engineers to ask questions about the steps taken or reasoning [00:05:22].
- UI/UX Considerations: The user experience (UX) is crucial for effective collaboration with agents and assistants. Old UX patterns are changing, and designs should prioritize agents working more like human teammates [00:10:28] [00:11:15].
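One way to support verification, trust, and follow-up questions is to keep a reviewable reasoning trace for each investigation. The record below is an illustrative sketch; the field names and feedback mechanism are assumptions, not a product schema.

```python
# Illustrative sketch of a reviewable reasoning trace; field names and the
# feedback mechanism are assumptions, not a product schema.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ReasoningStep:
    hypothesis: str
    tool_used: str
    finding: str
    human_verdict: Optional[str] = None  # "agree", "disagree", or None if unreviewed

def render_trace(steps: List[ReasoningStep]) -> str:
    """Produce a human-readable view so an engineer can audit each step."""
    lines = []
    for i, step in enumerate(steps, start=1):
        verdict = step.human_verdict or "unreviewed"
        lines.append(f"{i}. {step.hypothesis} | tool: {step.tool_used} | "
                     f"found: {step.finding} | reviewer: {verdict}")
    return "\n".join(lines)

def record_feedback(steps: List[ReasoningStep], index: int, verdict: str) -> None:
    """Capture agreement or disagreement with a step, which can feed later evaluation."""
    steps[index].human_verdict = verdict
```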
Lessons Learned in Building AI Agents
Developing AI agents for collaboration has provided several key learnings:
- Scoping Tasks for Evaluation: It’s easy to build demos, but much harder to properly scope and evaluate an agent’s performance. Tasks should be clearly defined and measurable, focusing on vertical, task-specific agents rather than generalized ones [00:08:01] [00:08:48].
- Prioritizing Evaluation: Deeply thinking about evaluation (offline, online, and living) from the start is critical. It’s easy to create demos that look like they work, but rigorous evaluation is needed for improvement over time [00:09:31]. Domain experts should serve as design partners and task verifiers, not as sole rule-setters [00:09:10].
- Building the Right Team: While some ML experts are needed, a team of optimistic generalists who can code fast and embrace ambiguity is vital. Team members should be excited to be AI-augmented themselves, as the field is rapidly changing [00:10:11] [00:10:41].
- Observability for AI: For complex agent workflows, robust observability is essential for debugging and situational awareness. This includes specialized “LLM observability” to track interactions and calls to models, regardless of how they are hosted or accessed [00:11:36] [00:12:05]. Visual tools like an “agent graph” can help understand multi-step calls and identify issues in complex decision-making processes; a minimal instrumentation sketch follows this list [00:12:46].
- Leveraging General Methods: The “bitter lesson” of AI engineering suggests that general methods leveraging new, off-the-shelf models are often the most effective. It’s crucial to remain agile and avoid locking into a specific model, as new advancements can quickly solve reasoning challenges [00:13:16] [00:13:45].
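For the observability point above, here is a minimal sketch of wrapping each model call in a recorded span; the span fields and the record_span sink are illustrative only and do not reproduce Datadog’s own LLM observability instrumentation.

```python
# Minimal sketch of recording each model call as a span, assuming a generic
# chat-completion callable. The span fields and record_span sink are
# illustrative and do not reproduce Datadog's LLM observability SDK.
import time
import uuid
from typing import Callable, List, Optional

SPANS: List[dict] = []  # stand-in for an observability backend

def record_span(span: dict) -> None:
    SPANS.append(span)

def traced_llm_call(model: str, prompt: str,
                    call_model: Callable[[str, str], str],
                    parent_id: Optional[str] = None) -> str:
    """Wrap a model call so every step of a multi-step agent run is recorded."""
    span_id = str(uuid.uuid4())
    start = time.time()
    output = call_model(model, prompt)  # works however the model is hosted or accessed
    record_span({
        "span_id": span_id,
        "parent_id": parent_id,   # parent links stitch steps into an agent graph
        "model": model,
        "prompt": prompt,
        "output": output,
        "latency_s": time.time() - start,
    })
    return output
```

Linking spans through a parent identifier is what would let a visual agent graph reconstruct a multi-step decision path.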
The Future of Human-AI Collaboration
It is anticipated that AI agents may surpass humans as users of platforms and products within the next five years [00:14:07]. This means companies should not only build for human users or their own agents but also consider how third-party agents might directly interact with their platforms, providing them with necessary context and API information [00:14:21].
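One hedged sketch of “providing agents with necessary context and API information” is a machine-readable manifest that an agent can fetch before calling the platform; the /agent-manifest route and its fields below are assumptions, not an existing standard.

```python
# Hedged sketch of exposing platform context to third-party agents.
# The /agent-manifest route and its fields are assumptions, not an existing standard.
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

AGENT_MANIFEST = {
    "platform": "example-observability-platform",
    "capabilities": ["query_logs", "query_metrics", "open_incident"],
    "api_docs": "https://example.com/api/reference",
    "auth": {"type": "api_key", "header": "X-API-KEY"},
}

class ManifestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/agent-manifest":
            body = json.dumps(AGENT_MANIFEST).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8080), ManifestHandler).serve_forever()
```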
In the future, a team of DevSecOps agents may be available “for hire,” handling on-call duties and platform integration directly [00:14:56]. This collaborative ecosystem will enable a significantly greater number of ideas to be brought to life, with automated developers like Cursor or Devin for creation and agents for operations and security [00:15:25].