Developing AI agents for productivity

From: aidotengineer

Twilio’s approach to developing AI agents is centered on enhancing productivity rather than replacing human roles [00:00:26]. The goal is to leverage AI to work smarter, not to instill fear about job displacement [00:00:13]. This strategy focuses on turbocharging workflows within a small team, like Twilio’s docs team [00:00:21].

The Challenge: Pain Points for a Tiny Docs Team

Twilio’s small documentation team faced significant challenges with a high volume of Jira tickets [00:01:05]:

Error-prone first drafts from over 100 product teams [00:01:12].
Time-consuming grooming tasks, such as style checks, alt text generation, and SEO optimization [00:01:17].
Hallucination risk if AI models were used without proper controls [00:01:23].

The team needed leverage to manage the workload without experiencing burnout [00:01:26].

The Solution: A Fleet of Single-Purpose AI Agents

Instead of building one monolithic “megabot,” Twilio opted to build six single-purpose AI agents [00:01:28]. These agents operate behind a simple Next.js frontend [00:01:34]. The “sweet spot” for an AI helper involves tasks that are repeatable, high-volume, and low-creativity [00:02:16].

Agent Breakdown

Each agent is designed to tackle a repetitive, well-scoped job, allowing humans to focus on judgment and clarity [00:02:10]:

Automated Editor: Fixes grammar, formatting, and accuracy in documentation [00:01:37].
Image Alt Text Generator: Provides instant accessibility by generating alt text for images [00:01:44].
Jargon Simplifier: Translates technical developer language into plain English [00:01:48].
SEO Metadata Generator: Creates title and description metadata, ensuring character count compliance [00:01:53].
Docs Outline Builder: Recommends navigation and structure for documentation (coming soon) [00:01:58].
Slack Bot: Helps triage requests received through help channels [00:02:05].
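
One way to picture the "fleet, not megabot" design is a small registry in which each agent carries its own narrow purpose and instructions. The sketch below is illustrative only: the `AgentSpec` shape, the agent ids, and the instruction strings are assumptions, not Twilio's actual configuration.

```typescript
// Hypothetical registry: field names, ids, and instruction text are
// illustrative assumptions, not taken from Twilio's implementation.
type AgentSpec = {
  id: string;           // short key the UI uses to select the agent
  purpose: string;      // the single job this agent is scoped to
  instructions: string; // agent-specific prompt text
};

const AGENTS: AgentSpec[] = [
  { id: "editor", purpose: "Fix grammar and formatting", instructions: "Apply the style guide." },
  { id: "alt-text", purpose: "Write image alt text", instructions: "Describe the image concisely." },
  { id: "jargon", purpose: "Simplify technical language", instructions: "Rewrite in plain English." },
  { id: "seo", purpose: "Generate title and description", instructions: "Respect character limits." },
];

// Look up exactly one agent; an unknown id is an error rather than a
// fallback to some general-purpose agent.
function getAgent(id: string): AgentSpec {
  for (const agent of AGENTS) {
    if (agent.id === id) return agent;
  }
  throw new Error(`unknown agent: ${id}`);
}
```

Keeping each entry small makes it possible to tune or replace one agent's instructions without touching the others.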

Architecture and Workflow

The workflow for these AI agents involves a layered approach that ensures accuracy and reduces errors [00:02:29]:

Next.js UI: A user interface initiates requests [00:02:31].
Custom GPT-401 Agent: The appropriate model is used for each specific job [00:02:37]. This custom GPT incorporates Twilio’s style guide and rubric, which are retrieved from Airtable for easy collaboration [00:02:44].
Validation Layer: Includes Vale linting and CI/CD tests [00:02:56].
GitHub PR: Adds code owner review, making it easier to scrutinize agent-suggested changes [00:03:03].
Human Merge: A human merges the changes only when they are correct, typically after product and engineering reviews [00:03:12]. This ensures multiple human eyes review changes before they are finalized [00:03:21].

This layered approach significantly reduces hallucinations without slowing down the process [00:03:27].
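
The validation step in this workflow can be sketched as a set of checks that an agent's draft must pass before it is queued for review. This is a minimal sketch under stated assumptions: the type names, the two example checks, and the banned-word list are all hypothetical stand-ins for the real linting and CI tests, which the talk does not detail.

```typescript
// Hypothetical types and checks; none of these names come from Twilio's code.
type Draft = { path: string; text: string };
type CheckResult = { check: string; passed: boolean; messages: string[] };
type Check = (draft: Draft) => CheckResult;

type PipelineOutcome =
  | { status: "rejected"; failed: CheckResult[] }
  | { status: "awaiting_human_review"; draft: Draft; results: CheckResult[] };

// Stand-in for a prose-lint rule: flag words a style guide might ban.
const noBannedWords: Check = (draft) => {
  const banned = ["simply", "just"]; // assumed list, for illustration
  const hits = banned.filter((w) => new RegExp(`\\b${w}\\b`, "i").test(draft.text));
  return {
    check: "banned-words",
    passed: hits.length === 0,
    messages: hits.map((w) => `avoid "${w}"`),
  };
};

// Stand-in for a CI test: the agent must not return an empty file.
const nonEmpty: Check = (draft) => {
  const ok = draft.text.trim().length > 0;
  return { check: "non-empty", passed: ok, messages: ok ? [] : ["draft is empty"] };
};

// Run every check and collect all failures so the author sees them at once.
function runValidation(draft: Draft, checks: Check[]): PipelineOutcome {
  const results = checks.map((check) => check(draft));
  const failed = results.filter((r) => !r.passed);
  if (failed.length > 0) return { status: "rejected", failed };
  // Passing validation only queues the change for review; nothing merges here.
  return { status: "awaiting_human_review", draft, results };
}
```

Note that the best outcome the function can return is `awaiting_human_review`. The merge itself stays a human decision, which mirrors the final step of the workflow.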

The AI Docs Buddy in Action (Demonstration)

The AI Docs Buddy demonstration, led by Maria Bermudez, shows these agents in practical use [00:03:44]. The platform allows users to access the various agents from an overview or release page [00:03:55].

Automated Editor

The automated editor can load an MDX or Markdown file, or take a live URL [00:04:06]. It uses the GPT-401 model, which was chosen after experimentation for its consistent application of the style guide and rubric [00:04:19]. The output shows a diff of the changes and a “changes made” tab that lists the original text, the revised text, and the specific style guidance applied [00:04:34]. The tool is powerful but “not perfect”: it may occasionally miss issues such as a missing SEO description [00:04:51].
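
The “changes made” tab described above suggests a simple record per edit: original text, revised text, and the guidance that motivated the change. The sketch below shows one possible shape for that record and a plain-text rendering of it. The `ChangeRecord` type and `formatChanges` function are hypothetical; the talk does not show the tool's actual data model.

```typescript
// Hypothetical record shape, inferred from the three columns described
// in the demo; the field names are assumptions.
type ChangeRecord = {
  original: string; // text before the agent's edit
  revised: string;  // text after the agent's edit
  guidance: string; // style rule cited as the reason for the change
};

// Render the changes as a numbered list a reviewer can scan quickly.
function formatChanges(changes: ChangeRecord[]): string {
  return changes
    .filter((c) => c.original !== c.revised) // drop entries that changed nothing
    .map((c, i) => `${i + 1}. "${c.original}" -> "${c.revised}" [${c.guidance}]`)
    .join("\n");
}
```

Recording the guidance next to each change gives reviewers a reason to check against, rather than a bare diff.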

SEO Metadata Generator

This agent generates a meta title and meta description for a given page, accounting for character limitations [00:05:41].
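
Character-limit compliance is easy to verify deterministically after the model responds, instead of trusting the model to count. The following is a minimal sketch: the `checkSeoMetadata` function is hypothetical, and the limits of 60 and 160 characters are assumed placeholder values, since the talk does not state the numbers Twilio uses.

```typescript
type SeoMetadata = { title: string; description: string };
type Limits = { maxTitle: number; maxDescription: number };

// Assumed limits for illustration; the talk does not give actual values.
const DEFAULT_LIMITS: Limits = { maxTitle: 60, maxDescription: 160 };

// Return a list of problems; an empty list means the metadata complies.
// Note: .length counts UTF-16 code units, which matches visible characters
// for most Latin text but not for all scripts or emoji.
function checkSeoMetadata(meta: SeoMetadata, limits: Limits = DEFAULT_LIMITS): string[] {
  const problems: string[] = [];
  const title = meta.title.trim();
  const description = meta.description.trim();

  if (title.length === 0) problems.push("title is empty");
  if (title.length > limits.maxTitle) {
    problems.push(`title is ${title.length} characters (max ${limits.maxTitle})`);
  }
  if (description.length === 0) problems.push("description is empty");
  if (description.length > limits.maxDescription) {
    problems.push(`description is ${description.length} characters (max ${limits.maxDescription})`);
  }
  return problems;
}
```

A check like this could run before the generated metadata is shown to the user, so an over-length result is regenerated or flagged instead of silently accepted.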

Alt Text Generator

Similar to the editor, the alt text generator can process a live URL or multiple selected pages [00:05:55]. It quickly generates alt text that conforms to the required format for the docs platform [00:06:15].

Jargon Simplifier

This tool is useful for simplifying complex text, assisting in writing and reviewing pull requests [00:06:28]. It provides a diff view and a revised text tab for quick copying and application as a pull request comment or direct file edit [00:06:56].

The team is also working towards enabling these agents to communicate with each other [00:07:13].

Mitigating Risks with Guard Rails

Building effective AI agents requires robust guard rails to ensure quality and mitigate risks [00:07:23]:

Hallucinations: Mitigated using Vale linting and CI tests, combined with human oversight from various stakeholders [00:07:29].
Bias: Addressed through data set tests and prompt audits [00:07:40].
Stakeholder Misalignment: Managed with weekly PR reviews (which can be compressed to days or hours) and Slack feedback loops, particularly with product managers and engineering teams [00:07:46]. These feedback cycles allow continuous prompt tuning [00:08:03].

Playbook for Success

Twilio recommends a three-step playbook for other teams looking to develop AI agents for productivity [00:08:11]:

Identify a single pain point that is hindering throughput [00:08:14].
Pick a single task that is repeatable and rule-based [00:08:20].
Loop with your users weekly (at least) through a “ship, measure, and refine” process [00:08:22].

By stacking these small wins, teams can significantly increase their velocity [00:08:28].

Conclusion

Twilio’s experience demonstrates that AI agents can help teams work smarter: focused, single-purpose tools address common pain points, and robust human oversight keeps their output trustworthy [00:00:15].
