Beating Google in AI Translation

From: redpointai

DeepL, a company recently valued at $2 billion, supports over 100,000 businesses globally with AI translation services [00:00:04]. Jarek Kutylowski, CEO and founder of DeepL, discussed how the company became a leader in AI translation while competing with giants like Google [00:01:44].

DeepL’s Journey and Competitive Edge

DeepL was engaged in cutting-edge AI research long before AI became a mainstream topic [00:00:43]. The release of ChatGPT was a significant moment for the company, as it brought widespread public awareness of what AI, and language models specifically, can do [00:01:34]. DeepL views competition, particularly with Google Translate, as beneficial because it drives innovation and leads to better outcomes for users [00:01:50]. This competitive drive fostered a sense of urgency within the company [00:02:16].

Kutylowski attributes DeepL’s success against larger competitors like Google to several key factors:

Speed to Market and Continuous Innovation [00:08:13].
Strong Academic Research Combined with Company Building [00:08:25].
Specialization on high-value translation use cases for businesses [00:08:36].
Geographic Advantage: Being established in Europe, a region with many different languages, provided the team with a deep understanding and motivation to solve complex translation problems [00:08:46].

The “Build It Yourself” Philosophy

DeepL maintains a “build it yourself” approach, owning the entire vertical stack from product development to engineering and research [00:09:59]. This includes building their own models and tooling, and even operating their own data centers [00:10:11]. This vertical integration allows them to better understand and solve customer problems by controlling all model parameters, training, and architecture [00:10:49].

One example of this feedback loop between application and model is the ability for customers to embed terminology directly into the models [00:11:38]. This crucial feature allows businesses to control the language used in their documents and manuals, even handling grammatical rules and contextual meanings for specific words [00:11:57].
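As a rough illustration of what terminology control means from the customer's side, the sketch below checks after the fact whether a translation uses the required target terms. This is not how DeepL embeds terminology into its models, which the interview does not detail; the names `GlossaryEntry` and `find_glossary_violations` and the sample terms are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class GlossaryEntry:
    """One customer-defined term pair (hypothetical structure)."""
    source_term: str  # term as it appears in the source language
    target_term: str  # rendering the customer requires in the target language


def _contains_term(text: str, term: str) -> bool:
    """Case-insensitive match of a term that is not part of a longer word."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def find_glossary_violations(
    source: str, translation: str, glossary: List[GlossaryEntry]
) -> List[GlossaryEntry]:
    """Return entries whose source term occurs but whose target term is missing."""
    violations = []
    for entry in glossary:
        used_in_source = _contains_term(source, entry.source_term)
        honored = _contains_term(translation, entry.target_term)
        if used_in_source and not honored:
            violations.append(entry)
    return violations


if __name__ == "__main__":
    glossary = [GlossaryEntry("circuit breaker", "Leistungsschalter")]
    source = "Reset the circuit breaker before restarting."
    translation = "Setzen Sie die Sicherung vor dem Neustart zurück."
    for entry in find_glossary_violations(source, translation, glossary):
        print(f"Missing required term: {entry.target_term}")
```

An exact string match like this misses inflected forms of a term, which is one reason handling grammatical rules inside the model, as described above, matters.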

DeepL uses different sets of models depending on language pairs and data availability, sometimes grouping similar languages or bundling models for operational efficiency [00:12:45]. They prioritize smaller, more efficient models for individual language pairs, especially given the scale of their operations [00:14:04].
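The idea of choosing a model per language pair, with grouped models for similar languages, can be sketched as a simple lookup. This is only an assumed illustration: the model names, the language grouping, and the `select_model` function are hypothetical and do not describe DeepL's actual setup.

```python
from typing import Dict, Optional, Tuple

# Hypothetical dedicated models for well-resourced language pairs.
DEDICATED_MODELS: Dict[Tuple[str, str], str] = {
    ("de", "en"): "de-en-small",
    ("en", "de"): "en-de-small",
}

# Hypothetical grouping of similar source languages that share one model.
LANGUAGE_GROUPS: Dict[str, str] = {
    "da": "scandinavian",
    "sv": "scandinavian",
    "nb": "scandinavian",
}

# Hypothetical shared models, keyed by (source group, target language).
GROUP_MODELS: Dict[Tuple[str, str], str] = {
    ("scandinavian", "en"): "scandinavian-en-shared",
}


def select_model(source_lang: str, target_lang: str) -> Optional[str]:
    """Prefer a dedicated pair model, then a shared group model, else None."""
    pair = (source_lang, target_lang)
    if pair in DEDICATED_MODELS:
        return DEDICATED_MODELS[pair]
    group = LANGUAGE_GROUPS.get(source_lang)
    if group is not None:
        return GROUP_MODELS.get((group, target_lang))
    return None


if __name__ == "__main__":
    print(select_model("de", "en"))
    print(select_model("sv", "en"))
    print(select_model("ja", "en"))
```

Keeping the routing in plain tables makes it easy to move a language from a shared model to a dedicated one once enough training data is available.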

Role of Human Expertise

Human data and translators play an increasingly vital role in DeepL’s development [00:14:48]. The company runs extensive in-house data annotation projects with human translators to train models and perform quality assurance [00:15:16]. This is particularly important for specialized models, where customers expect consistently high quality [00:15:33]. DeepL hires native speakers globally to ensure the highest quality of translation [00:16:09]. Data labeling is currently done in-house, although the company is considering outsourcing parts of it in the future [00:17:10]. The decision to keep it in-house stems from the need for control and high quality, especially for tasks where specific individual expertise is crucial [00:17:21].

Current State and Future of AI Translation

AI translation has advanced significantly, especially for well-resourced language pairs like German-English, where a lot of training data is available [00:30:51]. Use cases such as automatically translating newspaper articles or one-to-one communication like emails are considered largely “solved” for these languages [00:31:26].

However, for high-stakes applications like publishing a marketing website for a billion-dollar company or creating operating manuals for nuclear power plants, human-in-the-loop oversight is still recommended to ensure accuracy and accountability [00:31:50].

The Next Frontier: Spoken Language Translation

The next major frontier for DeepL is spoken language and voice translation [00:39:14]. While AI-based text translation has revolutionized how content from abroad is consumed, synchronous speech translation for conversations is still a developing area [00:39:34]. DeepL is conducting its own research to turn this into a usable product that can be integrated everywhere [00:40:01].

The widespread availability of real-time, ubiquitous translation models for speaking could dramatically change business operations, allowing for seamless communication across different languages and locations, potentially bridging geographical distances for co-founders and teams [00:40:41]. It would also democratize access to education, learning resources, and knowledge across linguistic barriers [00:41:13].

The main challenges for synchronous speaking models include latency and the inherent ambiguity and unstructured nature of spoken language compared to text [00:46:03]. Teaching models to translate different types of spoken language is essential [00:46:46].

Impact on Language Learning

AI could make learning to speak a language less expensive and more accessible by providing fluent conversation partners, but the personal and cultural value of learning languages will likely remain [00:47:25]. Although the average person might learn fewer languages in the future because of advanced translation tools, those who do will pursue it out of personal interest and intellectual challenge [00:43:34].

Moats and Defensibility in AI

DeepL believes that specialized models are significantly more defensible than general models, especially for large, critical use cases like translation [00:36:34]. While general AI models can do “everything,” getting them to perform reliably and seamlessly for specific tasks is complex [00:37:58]. The company has found that a model specifically trained and refined for translation, with good labeled reinforcement learning, outperforms a general model for this task [00:38:18].

Key Takeaways

Overhyped: General models [00:48:46].
Underhyped: Specialized models, where the real value is created [00:48:48].
Biggest Surprise: Beating tech giants like Google [00:49:19].
Lesson Learned: Technology alone is not enough; a full product with commercial strategy is essential for deployment [00:49:43].
Changed Mind On: The power of radical candor and very direct, open, and honest communication in running a company [00:50:18].
Excited AI Startup (outside their space): Perplexity, for taking on Google search with an LLM-based approach [00:51:03].
AI Application to Start (if not DeepL): Medicine, particularly democratizing access to healthcare and drug discovery [00:52:16].
Company Most Interesting to Run AI At: Nvidia, due to the excitement of building hardware [00:53:35].
