From: redpointai
Intercom, a company specializing in customer support, rapidly adopted AI following the launch of ChatGPT. Des Traynor, co-founder and Chief Strategy Officer of Intercom, described their journey from an immediate “all hands on deck” response to becoming a leader in AI adoption within the customer support industry [02:56:45].
Immediate Reaction to ChatGPT’s Launch
When ChatGPT launched around 5:00 PM in Dublin, Ireland, Intercom’s AI and ML team quickly recognized its potential [00:40:00]. Des Traynor was alerted by his VP of AI, Fergal, who messaged him about the new technology [00:47:00]. Initial reactions involved playing with the tool and being impressed by its ability to answer obscure questions and generate creative content, like songs in the style of Rage Against the Machine [01:09:00].
The company recognized that customer support was “in the kill zone” of AI and Large Language Models (LLMs) due to their conversational abilities, fact-finding, and summarization capabilities [01:17:00]. This led to a critical discussion about whether to “rip up the entire AI/ML roadmap” and go “all in” on the new technology [01:46:00]. Intercom decided to move quickly: they shipped their first AI-powered product before Christmas, a reasonable release in January, launched “Fin” in March, and followed with a broader release in July [01:58:00]. This rapid pace was driven by the understanding that if they didn’t lead, someone else would, potentially leaving little room for others in the support industry [02:40:00].
Phased AI Product Development
Intercom adopted a “crawl, walk, run” approach to product rollout, starting with low-risk features and gradually expanding [00:22:00].
Initial AI Features (“Crawl”)
The first “tracer bullet” involved integrating “zero downside” AI features into their existing inbox product [03:32:00]. These included:
- Summarizing conversations [03:57:00]
- Translating messages for multilingual support [04:00:00]
- Expanding or collapsing text [04:04:00]
This approach ensured that if users didn’t like the AI-generated output (e.g., a summary), they could simply choose not to use the feature [04:40:00]. However, the immediate user demand for automatic summarization highlighted significant cost implications. Automatically summarizing 500 million conversations a month would be prohibitively expensive, leading them to keep it as an optional button in the UI [04:51:00].
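The cost tradeoff behind keeping summarization as a button rather than running it automatically can be sketched with back-of-envelope arithmetic. The token count, price per token, and opt-in rate below are illustrative assumptions, not Intercom's actual figures; only the 500 million conversations per month comes from the episode.

```python
# Back-of-envelope cost of auto-summarizing every conversation vs. an
# opt-in button. All unit figures are illustrative assumptions.

CONVERSATIONS_PER_MONTH = 500_000_000   # volume cited in the episode
TOKENS_PER_SUMMARY_CALL = 1_500         # assumed prompt + completion tokens
PRICE_PER_1K_TOKENS = 0.002             # assumed blended $/1K tokens

def monthly_cost(conversations: int, opt_in_rate: float = 1.0) -> float:
    """Cost of summarizing, scaled by the fraction of users who opt in."""
    calls = conversations * opt_in_rate
    return calls * TOKENS_PER_SUMMARY_CALL / 1000 * PRICE_PER_1K_TOKENS

auto = monthly_cost(CONVERSATIONS_PER_MONTH)           # summarize everything
button = monthly_cost(CONVERSATIONS_PER_MONTH, 0.02)   # assume 2% press it

print(f"automatic: ${auto:,.0f}/month, opt-in button: ${button:,.0f}/month")
```

Even with generous assumptions, the automatic path runs to seven figures per month, while gating the same feature behind a button cuts the bill by whatever fraction of users never press it.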
Launching Fin (“Walk”)
The next major release was “Fin,” a user-facing chatbot, enabled by access to the beta of GPT-4 [05:17:00]. GPT-4 significantly helped in containing “hallucinations,” a major concern with GPT-3.5 [05:20:00].
Fin was designed to answer questions based on a high confidence threshold, with significant work put into ensuring it was trustworthy, reliable, and stayed on topic, avoiding inappropriate responses like political opinions or competitor recommendations [05:47:00].
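The confidence-threshold behavior described above can be sketched as a simple gate: answer only when retrieval confidence clears a bar, otherwise escalate to a human. The names, the scoring scheme, and the threshold value are hypothetical, not Intercom's implementation.

```python
# Minimal sketch of a confidence-gated answer flow: only answer when the
# best retrieved passage scores above a threshold, else hand off.
from dataclasses import dataclass

@dataclass
class Retrieved:
    passage: str
    score: float  # similarity score in [0, 1]

CONFIDENCE_THRESHOLD = 0.8  # assumed value, tuned in practice

def answer_or_escalate(question: str, hits: list[Retrieved]) -> str:
    best = max(hits, key=lambda h: h.score, default=None)
    if best is None or best.score < CONFIDENCE_THRESHOLD:
        return "ESCALATE_TO_HUMAN"
    # In production this step would prompt an LLM with best.passage as context.
    return f"Answer grounded in: {best.passage}"

print(answer_or_escalate("How do I reset my password?",
                         [Retrieved("Reset via Settings > Security.", 0.91)]))
```

The key design choice is that declining to answer is a first-class outcome: a bot that escalates is preferable to one that guesses.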
Expanding Capabilities (“Run”)
Following Fin, “Inbox AI” was launched, building on the initial set of features with additions like adjusting the tone of voice to match Intercom’s standard tone [06:09:00].
Key aspects of developing these broader AI features included:
- Guardrails and Hallucination Prevention: This involved creating “torture tests” with extensive scenarios, questions, and contexts to observe the AI’s behavior and set internal weighting for acceptable misbehaviors [06:57:00]. The team learned to prioritize context provided over the LLM’s general knowledge [08:22:27].
- Model Selection: Intercom continuously evaluates various LLMs (GPT-3.5, GPT-4, Anthropic’s Claude, Llama) based on trust, cost, reliability, stability, uptime, malleability, and speed [08:52:00]. They foresee a future where customers might even choose their preferred model [09:36:00].
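The “torture test” idea above can be sketched as a fixed battery of adversarial scenarios run through the bot, with each misbehavior scored against internal weights. The cases, categories, and weights here are invented examples of the pattern, not Intercom's actual test suite.

```python
# Sketch of a torture-test harness: replay adversarial scenarios and
# total up weighted penalties for guardrail failures.

TORTURE_CASES = [
    ("Who should I vote for?", "political"),
    ("Is your competitor's product better?", "competitor"),
    ("How do I reset my password?", "on_topic"),
]

# Internal weighting: how severely each misbehavior counts. Invented values.
FAILURE_WEIGHTS = {"political": 10, "competitor": 5}

def score_run(respond) -> int:
    """Total penalty across the battery; 0 means every guardrail held."""
    penalty = 0
    for question, category in TORTURE_CASES:
        reply = respond(question)
        if category in FAILURE_WEIGHTS and reply != "DECLINE":
            penalty += FAILURE_WEIGHTS[category]
    return penalty

# A toy bot that declines off-topic questions passes with zero penalty.
def safe_bot(question: str) -> str:
    off_topic = "vote" in question or "competitor" in question
    return "DECLINE" if off_topic else "Here's how..."

print(score_run(safe_bot))
```

Running the same battery after every prompt or model change gives a repeatable, quantitative view of whether the guardrails still hold.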
Cost Optimization vs. Exploration
Intercom remains in a “deep exploration mode” rather than focusing heavily on cost optimization [11:02:00]. Their primary goal is to build the “best possible product” by augmenting human agents, powering reporting, and enhancing message sending with AI [11:32:00]. They believe that technology will generally become cheaper and faster, and model improvements will inherently make their product better over time [14:43:00]. The shift to cost optimization will likely occur when LLMs plateau, similar to the “S-curve” concept where rapid acceleration gives way to incremental improvements [15:12:00].
Latency is a current forcing function that might drive the exploration of smaller, faster models [13:38:00].
Challenges and Missing Tools
Intercom has had to build many tools themselves due to a lack of off-the-shelf solutions [15:56:00]. Key missing areas include:
- Prompt Management: Tools for subtle prompt changes, re-running tests, versioning prompts across different models, and A/B testing [16:18:00].
- Robustness: Challenges with deploying AI in different regions (e.g., EU servers), leading to partnerships like the one with Microsoft Azure [17:00:00].
- Developer Experience: Opportunities for new tools and services to emerge, similar to how cloud computing spawned new categories in Ops and analytics [18:06:00].
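The prompt-management gap described above (versioning prompts per model, re-running tests, A/B comparisons) can be sketched as a small registry. The API below is hypothetical, illustrating the shape of the missing tooling rather than any existing library.

```python
# Sketch of versioned prompt storage keyed by (name, model), so a prompt
# change can be published, diffed against an old version, and re-tested.
from collections import defaultdict

class PromptRegistry:
    def __init__(self):
        self._versions = defaultdict(list)  # (name, model) -> [templates]

    def publish(self, name: str, model: str, template: str) -> int:
        """Store a new version; returns its 1-based version number."""
        self._versions[(name, model)].append(template)
        return len(self._versions[(name, model)])

    def get(self, name: str, model: str, version: int = -1) -> str:
        """Fetch a specific version, or the latest by default."""
        idx = version if version == -1 else version - 1
        return self._versions[(name, model)][idx]

reg = PromptRegistry()
reg.publish("summarize", "gpt-4", "Summarize this conversation: {text}")
v2 = reg.publish("summarize", "gpt-4", "Summarize in 3 bullets: {text}")
print(v2, "| v1 was:", reg.get("summarize", "gpt-4", version=1))
```

Keying versions by model matters because, as the episode notes, the same prompt often needs different phrasing across GPT-3.5, GPT-4, Claude, and Llama.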
A concern is OpenAI’s multifaceted role as a lab, an “AWS of AI,” and a consumer company (ChatGPT), which could lead to third-party tools being “sherlocked” [18:28:00].
Team Structure and Shipping Pace
Intercom maintains a centralized ML team of 17-20 people, comprising data scientists, AI, and ML engineers with deep domain expertise in building and training models [19:39:00]. This core team enables over 150 “regular product engineers” who then build user-facing features on top of the AI team’s endpoints [20:20:00].
Des Traynor distinguishes between:
- AI-native startups: Working on the bleeding edge of AI [21:08:00].
- Companies that use AI: Rebuilding or creating new product categories whose existence depends on LLMs [21:18:00]. Intercom falls into this category [21:32:00].
- Companies that have a bit of AI: Applying AI as “salt and pepper” to enhance existing products [21:35:00].
For companies in the first two categories, a dedicated AI/ML team with deep expertise is crucial [22:03:00]. ML projects introduce a “second wave” of uncertainty: after design, there’s the question of whether the functionality is even possible, a question that might not have a clear “no” answer, leading to prolonged effort without guaranteed results [23:01:00]. Therefore, AI/ML work needs to be viewed as a “portfolio of bets,” with some high-probability and some low-probability endeavors [23:39:00].
An example of a challenging problem is agent-side sentence completion, which, despite appearing simple, struggles with distinguishing personal information and abstracting irrelevant context [24:50:00].
Customer Adoption and Future of AI in Support
Intercom’s strategy for customer adoption involves making it easy to “dip your toe in” rather than a “trust fall” [27:06:00]. They allow customers to test AI with specific user segments (e.g., free users, weekend support, specific query types) [27:22:00]. The value of instant and correct answers often leads customers to eventually go “all in” because their test groups receive better support than paid users [28:09:00].
The broader adoption of AI will be significantly influenced by major consumer tech companies like Apple and Google integrating LLMs into their products (e.g., Siri, Bard). This will normalize the idea of conversational software and make AI a competitive battleground in the B2B space [29:26:00].
Currently, Fin uses Retrieval Augmented Generation (RAG) and adjusts tone of voice via prompting rather than fine-tuning [31:32:00]. Intercom’s internal AI lab continually explores custom models and fine-tuning [32:11:00].
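The RAG-plus-prompting pattern described above can be sketched in miniature: retrieve the most relevant help-center passages, then build a prompt that constrains the model to that context and a prescribed tone. The keyword-overlap scoring stands in for real embedding search, and all content is invented for illustration.

```python
# Minimal RAG sketch: naive retrieval over a tiny help center, then a
# prompt that pins the model to the retrieved context and a tone.

HELP_CENTER = [
    "To reset your password, open Settings > Security and click Reset.",
    "Refunds are processed within 5 business days of approval.",
]

def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank docs by word overlap with the question (stand-in for embeddings)."""
    qwords = set(question.lower().split())
    return sorted(docs, key=lambda d: -len(qwords & set(d.lower().split())))[:k]

def build_prompt(question: str, tone: str = "friendly and concise") -> str:
    context = "\n".join(retrieve(question, HELP_CENTER))
    return (f"Answer in a {tone} tone, using ONLY the context below. "
            f"If the answer is not in the context, say you don't know.\n"
            f"Context:\n{context}\nQuestion: {question}")

print(build_prompt("How do I reset my password?"))
```

Note that tone lives entirely in the prompt text, which is why it can be adjusted per customer without any fine-tuning.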
In the future, some support requests will be 100% handled by AI, especially in verticals like e-commerce with limited query types [33:17:00]. More complex products will see higher volumes and diversity of queries, making 100% automation harder but still achieving high percentages (e.g., 80-90%) [34:02:00].
Beyond text-based answers, AI will take actions, such as issuing refunds or canceling orders, which requires building significant software for authentication, monitoring, and data logging [35:06:00]. This could range from full automation to AI proposing actions for human approval [35:58:00].
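The spectrum from full automation to human approval can be sketched as a policy gate: the model never executes an action directly, it emits a proposal that is logged and either auto-applied under policy or queued for a human. The action names, the $50 auto-approve limit, and the class design are invented for illustration.

```python
# Sketch of the "AI proposes, human approves" pattern for actions like
# refunds, with an audit log for monitoring.
from dataclasses import dataclass, field

@dataclass
class ActionProposal:
    action: str
    amount: float
    status: str = "pending"

@dataclass
class ActionGate:
    auto_approve_limit: float = 50.0      # invented policy threshold
    log: list = field(default_factory=list)

    def submit(self, proposal: ActionProposal) -> str:
        if proposal.action == "refund" and proposal.amount <= self.auto_approve_limit:
            proposal.status = "executed"        # full automation under policy
        else:
            proposal.status = "awaiting_human"  # human-in-the-loop
        self.log.append(proposal)               # audit trail for monitoring
        return proposal.status

gate = ActionGate()
print(gate.submit(ActionProposal("refund", 20.0)))   # small refund
print(gate.submit(ActionProposal("refund", 500.0)))  # large refund
```

The audit log is the part the episode stresses: authentication, monitoring, and data logging around the action are where most of the engineering effort goes.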
Shifting Product Landscape
AI will completely disrupt certain product categories by making entire workflows automatic, eliminating the need for complex user interfaces [38:58:00]. For example, ad optimization software could become a background service with minimal user interaction [39:34:00].
Advice for Startups
Startups should target areas where the incumbent technology stack is irrelevant and would be built “entirely differently” if redesigned with AI [41:34:00]. This offers an advantage over incumbents who might only be able to add “cutesy little AI” features to existing complex systems [41:21:00].
Advice for Incumbents and Enterprise AI Application Deployment
Incumbents should:
- Find Asymmetric Upside: Start with simple AI features to understand costs and latency [42:17:00].
- Workflow Analysis: Break down their product into workflows and assess if AI can reliably perform them [42:38:00].
- Remove or Optimize: If AI can remove a workflow, it should be removed. If it can augment or simplify a workflow, it should be optimized [43:04:00].
- Sprinkle AI: Add AI enhancements even if not core to the workflow for completeness [43:51:00].
- Sell the Value: Focus on educating customers about the value of these AI transformations [44:12:00].
AI Trends: Overhyped and Underhyped
- Overhyped: Productivity tools that generate content (emails, sales pitches). Des believes people will learn to detect AI-generated content, and filters will emerge, leading to a diminished value for such tools [44:30:00].
- Underhyped: The impact of AI on creativity. Just as Instagram filters let everyone feel like a photographer, tools like Kaiber, Riffusion, and Synthesia are unlocking new forms of creativity in art, music, and video [44:56:00].
Impressions of Incumbent AI Adoption
- Most Impressed: Microsoft (beyond the obvious), Adobe (quick out of the blocks), Figma, and Miro for finding genuinely useful use cases instead of just putting AI on their homepage [46:08:00]. Shopify also received a mention for nice work [46:41:00].
- Most Disappointed: Apple and Amazon (Siri and Alexa). Despite their resources, their voice assistants seem primitive compared to advanced LLMs, highlighting a significant gap in conversational AI capabilities [46:48:00]. This perceived disparity between highly advanced generative AI (ChatGPT) and basic voice assistants creates a jarring experience for users [47:37:00].
Intercom emphasizes the importance of going “all in” on AI, starting with “zero downside” features, and continuously exploring new capabilities rather than solely focusing on cost optimization. They believe the broad adoption of AI by major tech companies will normalize its use and shift the competitive landscape significantly.