From: redpointai
Mistral AI positions itself at the forefront of the AI landscape by developing leading open-source large language models (LLMs) and offering solutions tailored for enterprise adoption [00:00:10]. The company views AI as an infrastructure technology that should be modifiable and owned by customers, and it advocates for open-source models becoming the prevalent form of the technology [00:03:31].
Mistral’s Offerings for Enterprises
Mistral provides both open-source and commercial offerings to remain relevant on both fronts [00:03:45] [00:03:33]. Their best models currently ship under commercial licenses, while the models just behind them are released as open source [00:05:05]. This split is a tactical choice driven by commercial and competitive pressures [00:05:19].
Mistral Large, their commercial offering, is designed as a portable solution, giving customers access to the model weights, which provides usability similar to an open-source model [00:04:51] [00:05:02].
Advantages of Licensing Weights for Enterprises
For enterprises, particularly those with the expertise to fine-tune and deploy models, licensing the technology (weights) instead of just a service offers significant benefits [00:05:47]:
- Deployment Flexibility: Enterprises can deploy the models wherever they want, typically where their data resides, which addresses data governance concerns [00:05:59] [00:06:02]; a minimal sketch of this local-deployment pattern follows this list.
- Specialization: Access to weights enables enterprises to perform the necessary specialization and customization for their specific needs [00:06:11]. This supports Building custom AI models for enterprises and highlights the Role of custom models and enterprise AI integration.
- Complex Applications: It allows for the creation of more involved applications beyond simple API usage [00:06:21].
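As a rough illustration of the deployment and specialization points above, the sketch below loads a publicly released open-weights Mistral model with the Hugging Face transformers library and runs it entirely on local infrastructure, so prompts and data never leave the enterprise environment. The specific checkpoint and prompt are assumptions for illustration; a licensed Mistral Large deployment would follow the same pattern with its own weights.

```python
# Minimal sketch: run an open-weights Mistral model on local infrastructure.
# Assumes the Hugging Face `transformers` (and `accelerate`) packages and the
# publicly released Mistral-7B-Instruct checkpoint; licensed weights would be
# swapped in for a Mistral Large deployment.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Build a chat-style prompt with the model's own chat template.
messages = [{"role": "user", "content": "Summarize our data-retention policy."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate locally; nothing leaves the machine the weights are deployed on.
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The same local access to weights is what makes fine-tuning and deeper customization possible, which a hosted-API-only offering cannot provide.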
Strategic Partnerships for Enterprise Adoption
Mistral’s partnership strategy focuses on optimizing for distribution and facilitating adoption where developers and enterprises operate [00:08:14]. Key partners include:
- Hyperscalers: Microsoft (Azure) [00:07:48]. These are natural partners as enterprises are typically their customers [00:08:45].
- Data Cloud Providers: Snowflake and Databricks [00:08:52]. Since enterprises move their data to these providers, connecting AI to this data for business intelligence is a natural channel [00:09:00].
- Hardware Providers: Nvidia [00:08:03].
This multi-platform approach aims to replicate their solutions across different environments [00:08:38]. Digital natives and smaller companies often go directly to Mistral’s platform for support, while larger enterprises, especially in Europe, prefer to use existing procurement processes via partners like Azure [00:09:36] [00:10:00]. This flexibility is key for Enterprise AI deployment models.
Addressing Enterprise Needs with Specialized Products
Mistral aims to be the most relevant platform for developers [00:05:29]. Their core strength lies in training and specializing models [00:06:59]. They also develop their own inference pipeline, though they leverage partnerships as well [00:07:27].
To help enterprises get started with generative AI, Mistral launched an enterprise assistant offering (Mistral Enterprise) [00:27:09] [00:27:32]. The assistant is designed to demonstrate the value of AI by raising worker productivity with assistants contextualized on enterprise data [00:27:18]. It also helps Mistral solidify its APIs and provides direct feedback for its developer platform [00:28:00]. This demonstrates how Enterprise adoption and use cases for AI can be kickstarted.
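The assistant and the developer platform share the same underlying APIs. As a hedged illustration, the sketch below calls a chat-completions endpoint on Mistral's la Plateforme with a system prompt that scopes answers to company documents; the model name, prompt, and MISTRAL_API_KEY environment variable are assumptions for illustration.

```python
# Sketch of the developer-platform side: a direct chat-completions call.
# The endpoint shape follows Mistral's public API; the model, prompts, and
# MISTRAL_API_KEY environment variable are illustrative assumptions.
import os
import requests

response = requests.post(
    "https://api.mistral.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['MISTRAL_API_KEY']}"},
    json={
        "model": "mistral-large-latest",
        "messages": [
            {"role": "system", "content": "Answer using only company policy documents."},
            {"role": "user", "content": "How many vacation days do new hires get?"},
        ],
    },
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```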
Data Strategy for Enterprise Fine-Tuning
When enterprises have large amounts of data, the initial strategy shouldn’t be to fine-tune immediately [00:29:06]. Instead, it’s recommended to first use retrieval-augmented generation (RAG) and empower the assistant with tools and access to databases [00:29:13].
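A minimal sketch of this retrieve-then-prompt pattern follows; the toy keyword retriever and sample documents are stand-ins for a real embedding index over enterprise data.

```python
# Minimal RAG sketch: retrieve relevant internal documents and ground the
# prompt in them, leaving the base model unchanged. The keyword-overlap
# retriever and sample documents are placeholders for a real embedding index.
from typing import List

DOCUMENTS = [
    "Expense reports must be filed within 30 days of travel.",
    "The on-call rotation changes every Monday at 09:00 UTC.",
    "Production database credentials are rotated quarterly.",
]

def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Rank documents by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    return sorted(docs, key=lambda d: len(terms & set(d.lower().split())), reverse=True)[:k]

def build_prompt(query: str, context: List[str]) -> str:
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context_block = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {query}"

query = "When do expense reports have to be filed?"
prompt = build_prompt(query, retrieve(query, DOCUMENTS))
print(prompt)  # This prompt would then go to the model, e.g. via the API call sketched above.
```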
For fine-tuning to be effective, enterprises need specific “demonstration data”: traces of what users actually do, which the model can learn to imitate [00:29:31]. This type of data is often not readily available to many enterprises, which creates a level playing field, since companies all need to acquire this new kind of data [00:29:44] [00:29:56]. This directly impacts Building custom AI models for enterprises.
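Purely as an illustration of what demonstration data might look like, the sketch below records one user interaction trace as a supervised fine-tuning example; the record layout is an assumption, not a Mistral-specified format.

```python
# Illustrative only: capture a "demonstration" -- how an employee actually
# handled a task -- as a supervised fine-tuning record. The layout is an
# assumed chat-style format, not a Mistral-specified schema.
import json

demonstration = {
    "messages": [
        {"role": "user", "content": "Draft a reply declining the vendor's renewal quote."},
        {
            "role": "assistant",
            # The text the employee actually sent serves as the target output.
            "content": "Thank you for the updated quote. After review, we will not "
                       "be renewing at the proposed rate this year...",
        },
    ]
}

# Appending records like this over time builds the demonstration corpus.
with open("demonstrations.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(demonstration) + "\n")
```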
Challenges and Future Directions
A significant challenge identified by Mistral is hiring for their US office, specifically for science profiles, due to the difficulty in attracting top talent [00:31:01].
From a broader AI perspective, Arthur Mensch (Mistral’s CEO) considers “synthetic data” overhyped because the term is poorly defined, while “optimization techniques” are underhyped [00:30:27].
Regulatory Impact on Enterprise AI
Mistral advocates for addressing AI safety through a product-safety lens, similar to how software safety is managed, by focusing on product expectations and evaluation [00:17:17]. They believe that regulating applications, rather than directly regulating models based on compute (FLOP) thresholds, is the more effective approach [00:21:17].
This is because LLMs are like programming languages: you can do anything with them [00:18:07]. While Mistral performs internal evaluations, regulating the model directly does not certify the safety of the resulting product [00:18:22]. For enterprises, regulating applications would create a “second-order pressure” on foundation model makers, encouraging them to provide models that application makers can best control [00:21:34] [00:22:24]. This fosters healthy competition, whereas direct regulation of the technology might favor big players with legal resources [00:22:42]. This discussion highlights Challenges in deploying AI models effectively due to policy.
The issue of transparency of training datasets is a complex one, as it balances the desire for openness with the need to protect trade secrets in a competitive landscape [00:18:45] [00:18:50].
Global and Multilingual AI Models
Mistral’s approach is to be a global company that emphasizes portability and multilingual capabilities [00:24:08]. They aim to make models great in every language, having started with French, where their models are considered among the best [00:23:51] [00:23:55].
The company believes that while there might be specialized companies focusing on specific languages, getting good at a language is primarily a pre-training task, which belongs to foundational model companies like Mistral [00:24:41] [00:24:47]. This focus ensures their technology is ubiquitous and usable by companies worldwide [00:24:17]. For countries, access to portable and modifiable technology like Mistral’s models should be sufficient to address sovereignty concerns [00:25:13]. Without such platform plays that ship models in a distributed way, a sovereignty problem could arise if only SaaS services are available [00:25:39]. This plays a role in Enterprise adoption and use cases for AI.