From: aidotengineer

Anthropic is an AI safety and research company focused on building highly capable and safe large language models (LLMs) [00:01:26]. Their mission involves not only releasing advanced models like Claude 3.5 Sonnet [00:01:53], but also leading in safety techniques, research, and policy [00:01:43]. Sonnet is particularly noted for its leadership in the code space, topping leaderboards for agentic coding evaluations like SWE-bench [00:02:04].

Anthropic’s Approach to Customer Success

Anthropic’s applied AI team works closely with customers on technical implementation and brings insights back to product and model research [00:00:21]. They support the technical side of use cases, helping customers design architectures and evaluations and tweak prompts to get the best out of the models [00:09:16]. This approach is grounded in hundreds of customer interactions and aims to provide actionable insights and best practices for implementing AI [00:01:14].

Solving Core Business Problems with AI

Anthropic encourages customers to use AI to solve the fundamental problems their products address, moving beyond basic applications like chatbots and summarization [00:05:22]. While these can be helpful, the focus should be on placing bigger bets [00:05:42].

For example, an onboarding and upskilling platform aims to help users ramp up quickly and advance their careers [00:05:49]. Instead of just summarizing course content or offering a Q&A chatbot, AI could be used to:

  • Hyper-personalize course content based on each individual employee’s context (a minimal sketch follows this list) [00:06:18].
  • Dynamically adapt course content to make it more challenging for fast learners [00:06:26].
  • Dynamically update course material based on learning styles (e.g., creating visual content for visual learners) [00:06:33].
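
As a rough illustration of the first idea, here is a minimal sketch using the Anthropic Python SDK. The helper, employee fields, and prompt wording are all hypothetical, a sketch rather than a prescribed implementation:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def personalize_lesson(lesson_text: str, employee: dict) -> str:
    """Rewrite one lesson for an employee's role, level, and learning style."""
    system = (
        f"Rewrite the following lesson for a {employee['role']} with "
        f"{employee['years_experience']} years of experience who learns best "
        f"from {employee['learning_style']} explanations. Keep the learning "
        "objectives unchanged."
    )
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",
        max_tokens=1024,
        system=system,
        messages=[{"role": "user", "content": lesson_text}],
    )
    return response.content[0].text

# Example: the same lesson, adapted for a visual learner on the data team
print(personalize_lesson(
    "Lesson 3: Introduction to SQL joins...",
    {"role": "data analyst", "years_experience": 2, "learning_style": "visual"},
))
```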

AI is making an impact across various industries, including taxes, legal, and project management, by drastically enhancing the customer experience [00:07:14]. That means making products easier to use and more trustworthy, and delivering high-quality outputs in domains where hallucination is unacceptable, such as tax preparation [00:07:27].

Case Study: Intercom’s Fin AI Agent

Intercom, an AI customer service platform, developed an AI agent called Fin [00:10:58]. Anthropic collaborated with Intercom’s data science team on a two-week sprint, comparing Intercom’s hardest prompt for Fin against the same prompt refined for Claude [00:11:27].

This initial success led to a two-month optimization sprint, fine-tuning all of Intercom’s prompts for Claude’s best performance [00:11:43]. The results showed Anthropic’s LLM outperforming Intercom’s existing model [00:11:57]. Intercom’s resolution-based pricing further incentivized making the agent genuinely helpful at solving customer problems [00:12:02].

Upon launching Fin 2, Intercom achieved impressive metrics:

  • Fin 2 can resolve up to 86% of customer support volume, with a 51% resolution rate out of the box [00:12:22].
  • Improved human-like interaction, allowing adjustment of tone and answer length [00:12:35].
  • Enhanced policy awareness, such as for refund policies, unlocking new capabilities [00:12:45].

Best Practices for AI Implementation and Success

When implementing AI, several best practices and common pitfalls should be considered:

1. Testing and Evaluation

A common mistake is building a robust workflow and then only later considering evaluations [00:13:28]. Evaluations should direct the path towards a perfect outcome and ideally be built from the outset or very shortly after [00:13:38].

It’s crucial to design representative test cases, including “silly examples” that a user might ask but are unrelated to the product, to ensure the model responds appropriately or reroutes the question [00:15:59]. Customers should invest in telemetry to back-test their architecture and understand the “latent space” of their use cases [00:15:32]. Evaluations are considered intellectual property, key to competitive advantage in navigating and optimizing AI solutions [00:15:17].
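
As a rough illustration, an eval harness along these lines can start small and grow with telemetry. In this sketch the product name, system prompt, and test cases are all hypothetical; the point is that off-topic “silly examples” are first-class test cases:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a support assistant for Acme (a hypothetical product). "
    "If a question is unrelated to Acme, respond with exactly OFF_TOPIC "
    "so the application can reroute it."
)

# Representative test cases, deliberately including a "silly example"
# that is unrelated to the product and should be rerouted, not answered.
TEST_CASES = [
    {"input": "How do I reset my Acme password?", "expect_on_topic": True},
    {"input": "What's a good banana bread recipe?", "expect_on_topic": False},
]

def run_evals() -> None:
    passed = 0
    for case in TEST_CASES:
        reply = client.messages.create(
            model="claude-3-5-sonnet-latest",
            max_tokens=256,
            system=SYSTEM_PROMPT,
            messages=[{"role": "user", "content": case["input"]}],
        )
        on_topic = "OFF_TOPIC" not in reply.content[0].text
        passed += int(on_topic == case["expect_on_topic"])
    print(f"{passed}/{len(TEST_CASES)} test cases passed")

if __name__ == "__main__":
    run_evals()
```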

2. Identifying Metrics and Trade-offs

Organizations often face an “intelligence-cost-latency triangle” when optimizing AI solutions [00:16:16]. It is difficult to optimize all three at once, so the right balance should be defined in advance for the specific use case [00:16:32].

For example:

  • Customer Support: Latency is critical; a response is needed within roughly 10 seconds to prevent customer abandonment [00:16:40]. User experience strategies, such as “thinking boxes” or redirecting customers to other pages, can manage perceived latency (see the streaming sketch below) [00:17:21].
  • Financial Research: High intelligence and accuracy are paramount, even if it means a longer response time (e.g., 10 minutes) because the stakes (capital allocation) are very high [00:16:55].

The stakes and time sensitivity of the decision should drive optimization choices [00:17:10].
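
Streaming is one common way to implement this kind of perceived-latency management: the user sees tokens (or a “thinking” indicator driven by them) immediately instead of waiting for the full answer. A minimal sketch with the Anthropic Python SDK, using a placeholder prompt:

```python
import anthropic

client = anthropic.Anthropic()

# Stream tokens as they are generated so the UI can render progress
# immediately, keeping perceived latency low even for longer answers.
with client.messages.stream(
    model="claude-3-5-sonnet-latest",
    max_tokens=512,
    messages=[{"role": "user", "content": "Where is my order #12345?"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
print()
```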

3. Fine-tuning Considerations

Fine-tuning is not a “silver bullet” [00:17:58]. It’s akin to “brain surgery on the model,” which can limit its reasoning in areas outside of what it’s fine-tuned for [00:18:06]. It’s recommended to try other approaches first, ensuring clear success criteria are established beforehand [00:18:16]. The cost and effort of fine-tuning must be justified by a clear difference in performance for the specific intelligence domain [00:18:41].

“Don’t let fine-tuning slow you down… explore other methods first; if you then realize you need fine-tuning, you can just sub in the fine-tuned model.” [00:18:56]

4. Other Optimization Methods

Beyond basic prompt engineering, several other features and architectures can drastically change the success of a use case [00:19:28]:

  • Prompt Caching: Can lead to significant cost reductions (e.g., 90%) and speed increases (e.g., 50%) without sacrificing model intelligence (a sketch follows this list) [00:19:47].
  • Contextual Retrieval: Improves the performance of retrieval mechanisms, feeding information more effectively to the model and reducing processing time [00:19:54].
  • Citations: An out-of-the-box feature that enhances reliability [00:20:09].
  • Agentic Architectures: Architectural decisions that can significantly impact performance [00:20:13].
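
Of these, prompt caching is the most mechanical to adopt: it is exposed directly in the Messages API through cache_control markers on stable prompt prefixes. A minimal sketch, assuming the system prompt is a large, unchanging document that meets the API’s minimum cacheable length:

```python
import anthropic

client = anthropic.Anthropic()

# A large, stable document reused across many requests (placeholder here;
# real prompts must meet the minimum cacheable token length).
LONG_REFERENCE_DOC = "...full product manual or policy document..."

response = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": LONG_REFERENCE_DOC,
            # Mark the stable prefix for caching; later requests that reuse
            # this exact prefix read it from the cache at reduced cost.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "What is the refund window?"}],
)
print(response.content[0].text)
```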

Getting Started with Anthropic’s AI

Anthropic offers several avenues for businesses to access their models:

  • API: For businesses looking to embed AI into their products and services (a minimal first call is sketched at the end of this section) [00:08:10].
  • Claude for Work: Empowers entire organizations to leverage AI in their day-to-day operations [00:08:14].
  • Cloud Partnerships: Access to frontier models via Amazon Bedrock or Google Cloud’s Vertex AI, allowing deployment in existing environments without managing new infrastructure, thereby reducing barriers to entry [00:08:22]. Support is consistent regardless of the access method [00:08:47].
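
For the API route, a first call with the Anthropic Python SDK can be as small as the following; it assumes ANTHROPIC_API_KEY is set in the environment, and the prompt is just a placeholder:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-latest",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Summarize our onboarding guide in three bullets."}
    ],
)
print(message.content[0].text)
```

The same request shape carries over to the cloud partnerships: the SDK’s AnthropicBedrock and AnthropicVertex clients expose the same Messages interface, so code changes little across access methods.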