From: redpointai

DeepL, a company recently valued at $2 billion, provides AI translation to over 100,000 businesses worldwide [00:00:00]. CEO and founder Jarek Kutylowski discussed how DeepL has managed to outperform tech giants like Google in AI translation, a market Google has been active in since the Transformer paper was published [00:07:42].

Core Strategy: Specialization and Innovation

DeepL attributes its success to a few key pillars:

Market Focus and Pace [00:08:13]

DeepL’s primary strategy was to enter the market with a “big fuss” and then maintain a relentless pace of innovation [00:08:13]. Kutylowski believes that competition is good because it drives the best outcomes for users [00:01:50]. The presence of major players like Google has instilled a “sense of urgency” within DeepL, which has been a driving force for the company [00:02:16].

Specialized Models over General Models [00:48:46]

A cornerstone of DeepL’s strategy is its focus on specialized models for high-value business translation use cases [00:08:30]. While general models receive more attention, DeepL argues that specialized models are where significant value is created [00:48:54]. This focus allows DeepL to build ready-made solutions that vertically integrate into customer workflows, which businesses prefer over complex internal processes [00:37:21].

Academic Research and Vertical Integration [00:08:25]

DeepL emphasizes building strong academic-level research capabilities within the company [00:08:27]. The company adopts a “build it yourself” approach, owning the entire vertical stack from product and engineering to research, including building its own tooling and data centers [00:10:00]. This gives DeepL control over model parameters, training, and architecture, so it can tune solutions to specific customer problems more effectively than it could by relying solely on prompt engineering [00:10:49]. The resulting tight feedback loop between application deployment and model refinement leads to a more tailored product [00:11:15]. One example is that DeepL’s customers can embed specific terminology into models, a crucial feature for businesses that other translation providers have struggled to integrate effectively [00:11:36].
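
To make the terminology requirement concrete, the following is a minimal sketch of a glossary compliance check that a business might run on translated output. It is not DeepL's method: the interview describes terminology being embedded into the models themselves, whereas this sketch only checks the result after the fact. The names `GlossaryEntry` and `find_glossary_violations`, the sample glossary, and the sample sentences are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryEntry:
    """A hypothetical customer glossary entry: a source term and its required translation."""
    source_term: str
    target_term: str


def find_glossary_violations(source_text, translated_text, glossary):
    """Return entries whose source term occurs in the source text
    but whose required target term is missing from the translation.

    Matching is a naive case-insensitive substring test; it ignores
    inflection and word boundaries, which a real checker would need to handle.
    """
    source_lower = source_text.lower()
    translated_lower = translated_text.lower()
    violations = []
    for entry in glossary:
        term_in_source = entry.source_term.lower() in source_lower
        term_in_translation = entry.target_term.lower() in translated_lower
        if term_in_source and not term_in_translation:
            violations.append(entry)
    return violations


# Hypothetical example data (English to German).
glossary = [
    GlossaryEntry("circuit breaker", "Leistungsschalter"),
    GlossaryEntry("invoice", "Rechnung"),
]
violations = find_glossary_violations(
    "Reset the circuit breaker before sending the invoice.",
    "Setzen Sie den Schutzschalter zurück, bevor Sie die Rechnung senden.",
    glossary,
)
for entry in violations:
    print(f"Expected '{entry.target_term}' for '{entry.source_term}'")
```

A check like this can only flag a mismatch once the translation exists, which is one reason a provider that controls training and architecture might prefer to handle terminology inside the model.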

Geographical Advantage [00:08:46]

Being established in Europe, a region with many different languages in close proximity, has given DeepL’s team a deep understanding of translation problems and a strong motivation to solve them [00:08:48].

Challenges and Solutions in AI Translation

Data Size Variation [00:13:30]

The availability of translated material differs significantly across language pairs (e.g., German-English vs. Polish-English) [00:13:01]. DeepL addresses this by running different sets of models, varying in size based on the available data [00:12:47]. Smaller models are sometimes used for individual language pairs, especially when optimizing for inference compute [00:14:04].
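
The idea of sizing a model to the available data can be sketched as a simple selection rule. This is an illustrative sketch only: the tier names, the thresholds, the corpus sizes, and the function `choose_model_tier` are invented for this example and do not come from the interview or from DeepL.

```python
# Hypothetical thresholds, measured in parallel sentence pairs.
LARGE_TIER_MIN_PAIRS = 100_000_000
MEDIUM_TIER_MIN_PAIRS = 10_000_000


def choose_model_tier(parallel_sentence_pairs: int, optimize_inference: bool = False) -> str:
    """Pick a model size tier for a language pair.

    More parallel data justifies a larger model. When inference compute
    is the priority, the largest tier is avoided even if data is plentiful.
    """
    if parallel_sentence_pairs < 0:
        raise ValueError("parallel_sentence_pairs must be non-negative")
    if parallel_sentence_pairs >= LARGE_TIER_MIN_PAIRS and not optimize_inference:
        return "large"
    if parallel_sentence_pairs >= MEDIUM_TIER_MIN_PAIRS:
        return "medium"
    return "small"


# Invented corpus sizes, chosen only to show the rule in action.
corpus_sizes = {"de-en": 500_000_000, "pl-en": 40_000_000, "xx-en": 2_000_000}
tiers = {pair: choose_model_tier(size) for pair, size in corpus_sizes.items()}
print(tiers)
```

In practice such a decision would likely rest on measured translation quality rather than fixed thresholds; the sketch only shows the shape of the trade-off between data availability, model size, and inference cost.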

The Role of Human Translators and Data Labeling [00:14:48]

Human data and translators play an increasingly important role in DeepL’s development [00:14:48]. DeepL has run internal data annotation projects for years, using human translators both to train models and for quality assurance [00:15:04]. This in-house approach, which requires native speakers from across the globe, is deemed crucial for the high quality expected of specialized models [00:15:31]. Although it is considering outsourcing parts of this work, DeepL emphasizes the importance of retaining control over the process and the quality of data labeling [00:17:10].

Infrastructure Decisions: Own Data Centers vs. Hyperscalers [00:26:43]

DeepL chose to operate its own data centers from the beginning due to the lack of alternatives at the time [00:27:04]. This decision now provides cost advantages and ensures access to the newest hardware for faster market deployment [00:29:20]. While hyperscalers are great for kickstarting operations, DeepL advises companies operating at significant scale to consider owning data centers for reasons of cost efficiency and hardware availability [00:27:01]. However, running internal infrastructure is more complex and slows development. As a result, DeepL has moved large parts of its stack to hybrid cloud solutions, retaining on-premises infrastructure only where efficiency, security, or data protection requires it [00:29:49].

The Future of AI in Language

Synchronous Speech Translation [00:39:11]

Kutylowski sees spoken language and voice as the “next frontier” in translation [00:39:14]. While text translation has significantly changed how content is consumed, synchronous speech translation will transform conversations [00:39:34]. This technology could enable seamless communication in businesses across continents, allowing employees to access information and education more easily regardless of location [00:40:48].

The main challenges for synchronous speech models include latency, ambiguity in spoken language, and the unstructured nature of conversations [00:46:01]. DeepL expects early products in a few years, but perfecting the technology will take time, as it did for text translation [00:45:28].

Impact on Language Learning [00:47:05]

AI in language learning holds promise for democratizing access to language practice by making it less expensive than lessons with traditional in-person teachers [00:47:25]. While AI models can facilitate fluent conversations, the intrinsic human desire for cultural connection and intellectual challenge suggests that people will continue to learn languages, perhaps more for personal interest than business necessity [00:43:56]. Kutylowski speculates that the average person might learn fewer languages in the future, but those who do will find more enjoyment in it [00:43:34].

Current State of AI Translation [00:30:30]

The quality of AI translation still varies significantly by language pair, with more widely used languages generally performing better due to greater training data availability [00:30:46]. While one-to-one communication like emails is largely “solved” for well-resourced languages, high-stakes contexts (e.g., marketing websites for billion-dollar companies, nuclear power plant manuals) still require human oversight and accountability [00:31:50].

Learnings and Reflections

  • Beating Tech Giants: Kutylowski admits that beating tech giants like Google was a significant surprise, emphasizing the importance of pace and delivery [00:49:19].
  • Beyond Technology: He learned that technology alone is not enough; a successful company needs to build a complete product with commercial considerations to effectively deploy technology to users [00:49:43].
  • Innovation Mindset: DeepL’s philosophy embraces the idea that innovation means throwing away many results, including failed experiments and superseded solutions. Each failed attempt offers a deeper understanding of the problem, leading to the next level of development [00:32:32].
  • Radical Candor: Kutylowski highlights the power of “radical candor” – direct, open, and honest communication within the company – as a crucial cultural aspect that he initially underestimated [00:50:18].

Moats and Defensibility in AI [00:35:40]

For DeepL, the moat lies in its specialized models compared to very general ones [00:36:32]. In specific, large-scale use cases like translation, specialized models make significant sense and allow for the creation of vertically integrated, ready-made solutions that customers prefer [00:37:14]. While general AI can do many things, it is complicated to deploy reliably as an SDK [00:38:00]. DeepL’s experience suggests that models specifically trained and reinforced with good labeled data for translation outperform general models [00:38:16].

The company continues to optimize its models, focusing on smarter architectures that achieve more with less compute rather than relying on brute-force compute investment alone [00:25:06].