From: redpointai
Mistral, a company at the epicenter of the AI zeitgeist, focuses on building leading open-source large language models (LLMs) [00:00:10]. Their CEO and co-founder, Arthur Mensch, discusses Mistral’s strategy for enterprise adoption and the future of AI application deployment [00:03:09].
Mistral’s Dual Offering and Enterprise Focus
Mistral offers both open-source and commercial LLMs [00:03:45]. The company believes that AI, as an infrastructure technology, should be modifiable and owned by customers, making open-source models a priority [00:03:36]. Their aim is to close the performance and usability gaps that currently exist between open-source and closed-source offerings, including better software tooling around the models and well-functioning APIs [00:04:04].
Mistral’s best models are shipped with commercial licenses, while models just below that performance tier are released as open-source [00:05:05]. This strategy is tactical and subject to change due to commercial and competitive pressures, but their core mission remains to be the most relevant platform for developers [00:05:17].
Licensing and Deployment Flexibility for Enterprises
Mistral’s approach to enterprise AI adoption involves licensing their technology (the model weights) rather than just providing API access [00:05:53]. This allows enterprises to (as sketched in the example after this list):
- Deploy models where they want, typically where their data resides, solving data governance issues [00:06:00].
- Perform necessary specialization and fine-tuning [00:06:11].
- Build more involved applications [00:06:21].
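As a concrete illustration of this self-hosting path, here is a minimal sketch, assuming a Hugging Face `transformers` environment and Mistral’s openly released `mistralai/Mistral-7B-Instruct-v0.2` weights; the prompt and hardware settings are illustrative assumptions, and commercially licensed weights would be deployed along similar lines.

```python
# Minimal sketch: self-hosting an open-weights Mistral model so that
# inference runs entirely inside the enterprise's own infrastructure.
# Model id, dtype, and prompt are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # open-weights release

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit on a single large GPU
    device_map="auto",           # place layers on whatever devices are available
)

messages = [{"role": "user", "content": "Summarize our data-retention policy."}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the weights live on the enterprise’s own hardware, the same checkpoint can then be fine-tuned or wrapped in more involved applications without data leaving the environment.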
While Mistral’s core strength lies in training and specializing models [00:06:59], they also recognize the need to run inference effectively. They have been building their own inference pipeline while also leveraging partnerships [00:07:13].
Strategic Partnerships for Distribution and Adoption
Mistral has formed significant partnerships with companies like Microsoft, Snowflake, Databricks, and Nvidia [00:08:00]. The strategy behind these partnerships is to optimize for distribution by operating where enterprises and developers already exist [00:08:06].
- Hyperscalers (e.g., Microsoft Azure): These are natural partners because enterprises are often their customers. This allows developers using Azure to easily find and use Mistral models, facilitating adoption [00:08:21].
- Data Cloud Providers (e.g., Snowflake, Databricks): Enterprises have moved their data to these providers, and AI becomes very useful when connected to this data for purposes like business intelligence [00:08:52]. This offers a natural channel for AI integration [00:09:11].
For smaller companies and digital natives, direct engagement with Mistral’s platform is common, providing direct support [00:09:36]. However, larger enterprises, especially European ones, often prefer to use existing procurement processes through partners like Azure, utilizing their Azure credits to access Mistral’s technology [00:09:51]. This multi-platform approach aims to replicate their solution across different environments [00:08:38].
Challenges and Strategies in AI Application Development
Mensch highlights that a key challenge in AI application development is making models controllable [00:10:47]. Much research is still needed in this direction, particularly for tweaking models to follow instructions precisely [00:11:00].
L’Assistant Entreprise as an Entry Point
Mistral developed L’Assistant Entreprise to help enterprises get started with generative AI [00:27:09]. This assistant is designed to contextualize on enterprise data and increase worker productivity [00:27:24]. It also serves to solidify the APIs behind it, such as moderation tools, which Mistral intends to expose on their platform [00:27:45]. This product helps enterprises engage with generative AI before they fully understand its applications for their core business [00:28:09].
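As a rough sketch of what such an assistant looks like behind the scenes, the snippet below calls Mistral’s hosted chat completions endpoint with a system prompt that grounds answers in enterprise context. The model name, prompts, and helper function are illustrative assumptions, not Mistral’s actual product implementation.

```python
# Minimal sketch of calling Mistral's hosted chat API over HTTP.
# Endpoint and payload shape follow Mistral's public API; the
# grounding prompt and function are hypothetical.
import os
import requests

API_URL = "https://api.mistral.ai/v1/chat/completions"

def ask_assistant(question: str, context: str) -> str:
    """Send a question, grounded in enterprise context, to a hosted model."""
    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
        json={
            "model": "mistral-large-latest",  # illustrative model choice
            "messages": [
                {"role": "system",
                 "content": f"Answer using only this context:\n{context}"},
                {"role": "user", "content": question},
            ],
        },
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]
```

In a real assistant, the `context` argument would come from retrieval over enterprise data, as discussed in the next section.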
Data for Fine-tuning and Retrieval Augmentation
For enterprises with large amounts of data, fine-tuning directly is not always the first step [00:29:08]. Instead, Arthur Mensch suggests:
- Retrieval-Augmented Generation (RAG): Using RAG to empower assistants with tools, data, and the ability to make requests to databases [00:29:13] (a toy retrieval sketch follows this list).
- Demonstration Data: For fine-tuning, data consisting of user traces (records of what users actually do) is more valuable for imitating behavior and building robust assistants [00:29:31]. This type of data is often missing in enterprises, creating a level playing field for companies to acquire it and rethink their data strategy in light of deploying copilots and assistants [00:29:54].
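To make the RAG suggestion concrete, here is a toy sketch that ranks enterprise documents against a query and builds a grounded prompt. The bag-of-words retrieval and the sample documents are purely illustrative assumptions; a real deployment would use an embedding model and a vector database.

```python
# Toy sketch of retrieval-augmented generation (RAG): retrieve the most
# relevant documents for a query, then pass them as context to a model.
# Bag-of-words cosine similarity stands in for a real embedding model.
import math
from collections import Counter

def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    q = vectorize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, vectorize(d)), reverse=True)
    return ranked[:k]

# Hypothetical enterprise documents for illustration.
documents = [
    "Quarterly revenue is stored in the finance warehouse, table fin.revenue.",
    "Employee onboarding documents live in the HR knowledge base.",
    "The sales pipeline is tracked in the CRM and refreshed nightly.",
]

query = "Where can I find quarterly revenue figures?"
context = "\n".join(retrieve(query, documents))

# The grounded prompt would then go to the model, e.g. via the chat API
# sketched in the previous section.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```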
Regulation and Product Safety
Mistral advocates for regulating AI products based on safety, similar to how software safety is addressed [00:17:17]. Focusing on the product’s expected behavior and evaluation is key [00:17:25]. While current regulations, like the EU AI Act, introduce technology-specific requirements (e.g., evaluation, red-teaming), Mensch believes they don’t solve the fundamental product safety problem [00:17:50].
[00:18:07] “LLMs are just like a coding language, it’s a programming language, and so you can do whatever you want with it… At the end of the day, even if you have access to these evaluations, this is not going to certify that the products that are made with it are actually safe.”
Mensch believes that the onus should be on application makers to verify that their AI-powered tasks are well-solved [00:21:17]. This creates a “second-order pressure” on foundational model makers to provide models that are easier for application makers to control and evaluate for specific tasks [00:21:31].
[00:22:24] “The application makers are going to pick the model that they can best control.”
This approach fosters healthy competition by incentivizing model makers to improve controllability [00:22:31]. Regulating the technology directly, however, can favor larger players due to their ability to manage regulatory burdens [00:22:42].
Future of LLMs and Mistral’s Role
Mistral remains committed to pushing the efficiency frontier for LLMs, aiming for more compressed models and better controllability [00:10:25]. They also anticipate improvements in architectures beyond plain Transformers for greater efficiency [00:11:09]. As efficiency improves and latency drops, new applications that rely on LLMs as basic building blocks for planning and exploration will emerge [00:11:34].
Mistral’s approach to global enterprise adoption is to be a multi-platform, global, and multilingual company [00:24:08]. This involves making models great in every language, starting with French, and ensuring portability to allow countries and developers to deploy the technology wherever they need [00:23:51]. This strategy addresses potential sovereignty concerns by providing access to controllable technology rather than just Software-as-a-Service (SaaS) offerings [00:25:28].
Arthur Mensch believes that the optimal technological solution is to have a few global model providers who make their technology portable and modifiable for local deployment, addressing political concerns about sovereignty [00:25:02].
The company’s rapid success, including quick market attention and strong inbound interest, has reinforced their confidence that they can ship good models on a short timeline [00:26:45]. Mistral continues to focus on training and specializing models, offering customization abilities, and building out their inference capabilities [00:07:01]. While they are still flexible on which parts of the broader developer platform they will build versus partner for, their core strength lies in model development and fine-tuning [00:36:36].