From: redpointai

Mistral AI, co-founded by Arthur Mensch, is positioned at the epicenter of the AI landscape, building one of the leading open source LLMs [00:00:10]. This article explores the evolving dynamic between open source and closed source models as discussed by Arthur Mensch.

The Evolving AI Landscape

The current AI landscape is characterized by a divide between closed-source offerings, such as those from OpenAI, Anthropic, and Google, and the open-source side, which includes Meta and Mistral [00:03:11]. Arthur Mensch believes this landscape will solidify with open source prevailing [00:03:31]. He views AI as an infrastructure technology that should be modifiable and owned by customers [00:03:36].

Mistral’s Dual Offering

Mistral currently offers both an open source and a commercial offering, a strategy that may evolve as they seek to establish a sustainable business model for open-source development [00:03:45]. The goal is to be the most relevant platform for developers [00:05:29].

Mistral’s decision on what goes into closed-source versus open-source is tactical and subject to competitive and commercial pressures [00:05:17]. Generally, their very best models are shipped with commercial licenses, while those just below are open source [00:05:05].

Challenges for Open Source Adoption

While open-source offerings are gaining traction, there have been slight gaps in performance and usability compared to closed-source options [00:04:04]. Closed-source offerings previously had better software surrounding APIs [00:04:14]. Mistral is actively working to close these gaps [00:04:06].

Licensing Weights for Enterprises

Mistral provides enterprises with access to model weights for its commercial offerings, such as Mistral Large [00:04:56]. This approach offers several benefits:

  • Deployment Flexibility: Enterprises can deploy models where their data resides, addressing data governance concerns [00:06:00].
  • Specialization: Access to weights allows enterprises to perform the necessary specialization and connect models to their specific data and tools, enabling more involved applications than simple API usage [00:06:11].

This contrasts with other providers who typically charge via API access [00:05:43]. Small companies and digital natives often prefer to go directly to Mistral’s platform for direct support [00:09:36]. Larger enterprises, particularly in Europe, often prefer to use existing procurement processes via partners like Azure [00:09:51].

Competition and Efficiency in LLM Development

The competition in building and utilizing large language models is intense. While some companies, like Meta, have vast GPU resources (e.g., 600,000 GPUs), Mistral focuses on efficiency and a high ratio of GPUs per person [00:12:01]. This lean approach allows it to train models efficiently and innovatively, having achieved significant results with 1.5K H100 GPUs [00:12:17].

Arthur emphasizes the importance of unit economics, ensuring that compute spending translates to revenue [00:12:46]. Being efficient with training compute is key to a valid business model [00:13:02].

Future of Language Models

Arthur believes there’s still an “efficiency frontier” to push in LLMs [00:10:25]. Improvements will come from:

  • Architectural Innovations: Making models more efficient than plain Transformers, which spend the same compute on every token [00:11:09].
  • Controllability: Significant research is still needed to make models more controllable and follow instructions precisely [00:10:44].
  • Deployment: Deploying models on smaller devices and improving latency will open up new applications that use LLMs as basic building blocks for complex tasks like planning and exploration [00:11:27].
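The first point above, that plain Transformers spend the same compute on every token, is the motivation behind conditional-compute designs such as sparse mixtures of experts. The toy sketch below (illustrative only, not Mistral's actual architecture; all shapes and names are invented for the example) shows the core idea: a gate routes each token to one expert, so different tokens exercise different parameters rather than a single shared feed-forward block.

```python
import numpy as np

def topk_gate(gate_logits, k=1):
    """Select the top-k expert indices per token from gate logits."""
    return np.argsort(gate_logits, axis=-1)[:, -k:]

rng = np.random.default_rng(0)
num_tokens, d_model, num_experts = 4, 8, 4

x = rng.normal(size=(num_tokens, d_model))            # token activations
gate_w = rng.normal(size=(d_model, num_experts))      # router weights
expert_w = rng.normal(size=(num_experts, d_model, d_model))  # one matrix per expert

routes = topk_gate(x @ gate_w, k=1)  # shape (num_tokens, 1)

# Each token is processed only by its selected expert, so the set of
# parameters used varies per token instead of being identical for all.
out = np.stack([x[t] @ expert_w[routes[t, 0]] for t in range(num_tokens)])
print(out.shape)  # (4, 8)
```

Real systems route to the top-2 of many experts and weight the outputs by the gate's softmax scores; the sketch keeps only the routing decision to make the conditional-compute idea visible.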

While new architectures are proposed, the co-adaptation of training algorithms, debugging methods, and hardware to Transformers makes it challenging to switch to entirely new architectures [00:14:50]. Mistral’s innovations focus on improving sparse attention for memory efficiency within the Transformer framework [00:15:37].
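To make the memory argument concrete, here is a minimal sketch (an assumption about the general technique, not Mistral's implementation) of a sliding-window attention mask: each query position attends only to a fixed window of recent keys, so the attention pattern stored in memory scales with `seq_len * window` rather than `seq_len ** 2`.

```python
import numpy as np

def sliding_window_mask(seq_len, window):
    """Boolean mask: position i may attend to j iff i - window < j <= i (causal)."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

mask = sliding_window_mask(seq_len=6, window=3)

# Each row has at most `window` True entries, so nonzero attention
# scores grow as O(seq_len * window) instead of O(seq_len ** 2).
print(mask.sum(axis=1))  # [1 2 3 3 3 3]
```

Information from outside the window still propagates across layers, since each layer extends the effective receptive field by another `window` positions.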

Regulation and Sovereignty

Regarding regulation, Mistral advocates for addressing AI safety from a product safety perspective, similar to general software safety [00:17:17]. They believe the focus should be on evaluating the product’s expected performance, rather than regulating the underlying technology based on arbitrary thresholds [00:17:25].

Arthur suggests that policymakers should pressure application makers to verify that their AI products perform as expected, similar to car safety testing [00:21:17]. This would create a “second-order pressure” on foundational model makers to provide controllable models [00:21:34]. Regulating the technology directly, as seen in parts of the EU AI Act, can favor large players who can deploy legal teams to influence standards [00:22:45].

The rise of foundation models for different countries (e.g., India, Japan) highlights the importance of:

  • Portability: Enabling countries and developers to deploy technology where they want is Mistral’s approach to sovereignty [00:23:23].
  • Multilinguality: Since LLMs speak languages, models need to be great in every language, not just English [00:23:31]. Mistral started with French and aims for global multilingual capability [00:23:55].

Arthur believes that if companies offer a “platform play” by shipping portable models to various countries, it should be enough for countries to feel confident they control the technology [00:25:28]. However, if only a few companies offer Software-as-a-Service (SaaS) exclusively, a sovereignty problem could arise [00:25:41].

Future Directions for LLMs

Mistral is also exploring how to help enterprises with data for model specialization. While large datasets are useful, Arthur notes that many enterprises lack the readily available “demonstration data” (traces of user actions) needed for robust fine-tuning [00:29:31]. He suggests that retrieval-augmented generation (RAG) and empowering assistants with tools and database access should be the first steps for enterprises with large data volumes [00:29:08]. This suggests a more even playing field for companies looking to leverage AI, as they will need to acquire new types of data [00:29:58].
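The RAG pattern mentioned above can be sketched in a few lines: retrieve the documents most relevant to a question, then prepend them to the prompt so the model answers from enterprise data rather than its weights. This is a deliberately tiny illustration with made-up documents and a bag-of-words scorer; production systems use embedding-based vector search and a real LLM call.

```python
from collections import Counter

# Toy document store; in practice this would be an enterprise
# database or vector index (contents here are invented examples).
DOCS = [
    "Invoices are processed within 30 days of receipt.",
    "Employees accrue 25 vacation days per year.",
    "Support tickets are triaged by severity first.",
]

def score(query, doc):
    """Word-overlap relevance; real systems use embedding similarity."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def build_prompt(query, k=1):
    """Retrieve the top-k documents and prepend them as context."""
    ranked = sorted(DOCS, key=lambda doc: score(query, doc), reverse=True)
    context = "\n".join(ranked[:k])
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("How many vacation days do employees get?")
```

The resulting prompt grounds the model's answer in retrieved text, which is why RAG needs no fine-tuning data: the enterprise's existing documents do the work that demonstration traces would otherwise have to.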

Arthur expressed excitement for “hard science” applications of AI, specifically mentioning materials science, which he believes still lacks a foundational model [00:32:36]. He sees immense potential in accelerating processes like ammonia synthesis, a carbon-intensive process, through AI-driven exploration [00:32:49].