Source: All-In Podcast
The debate between open-source and closed-source approaches is a central theme in AI development, with implications for innovation, cost, and geopolitical competition [00:18:17].
Deepseek R1: A Case Study
The release of Deepseek’s R1 language model, which is comparable to OpenAI’s o1 model, brought this debate to the forefront [00:16:02]. Deepseek claimed to have developed R1 for roughly $6 million, versus a reported $800 million cost for GPT-4’s training and a projected $1 billion for GPT-5 [00:16:03]. Deepseek further intensified the debate by open-sourcing R1 and offering API access at a fraction of the cost [00:21:06].
This development legitimately surprised many in the industry, shifting perceptions of how far China trails the US in AI model development from 6-12 months to potentially 3-6 months [00:21:32].
Cost Claims and Compute Resources
The claim of developing R1 for $6 million has been largely debunked [00:22:12]. This figure likely refers only to the final training run, not total R&D and hardware costs [00:23:30]. Experts estimate that Deepseek, together with its associated hedge fund, operates a compute cluster of roughly 50,000 Hopper GPUs (10,000 H100s, 10,000 H800s, and 30,000 H20s), hardware costing over a billion dollars [00:24:31].
Innovation Driven by Constraint
Despite the disputed cost, Deepseek’s technical innovations are notable [00:26:51]. They developed a new reinforcement learning algorithm, GRPO, that is highly performant while using less memory than the conventional PPO algorithm [00:27:15]. They also worked around Nvidia’s proprietary CUDA language by writing PTX, dropping down closer to the bare metal [00:28:01]. These innovations suggest that constraints, possibly related to GPU access or memory, can drive unique solutions that might not emerge in environments with abundant compute [00:28:22].
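To make the GRPO memory claim concrete, here is a minimal sketch (assuming PyTorch, with rewards shaped `(batch, group_size)`). Where PPO trains a separate critic network to estimate advantages, GRPO normalizes each sampled response’s reward against its own group of samples, which is where the saving comes from. Function names are illustrative and the published objective’s KL penalty toward a reference policy is omitted; this is not Deepseek’s actual code.

```python
import torch

def grpo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    """Group-relative advantages: score each sampled response against the
    mean/std of its own group, so no learned value (critic) network is
    needed and its memory footprint disappears."""
    mean = rewards.mean(dim=-1, keepdim=True)
    std = rewards.std(dim=-1, keepdim=True)
    return (rewards - mean) / (std + 1e-8)

def grpo_loss(logp_new: torch.Tensor, logp_old: torch.Tensor,
              advantages: torch.Tensor, clip_eps: float = 0.2) -> torch.Tensor:
    """PPO-style clipped surrogate, reused with group-relative advantages
    (the published GRPO objective also adds a KL penalty, omitted here)."""
    ratio = torch.exp(logp_new - logp_old)
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return -torch.min(ratio * advantages, clipped * advantages).mean()
```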
Distillation Accusations
A significant part of the controversy involves accusations that Deepseek’s R1 model was “distilled” from OpenAI’s models [00:30:57]. Distillation is a process in which a smaller model learns from a larger, more powerful model by querying it and training on its responses [00:31:13]. Evidence, such as Deepseek’s V3 model self-identifying as ChatGPT, suggested substantial training on OpenAI output [00:35:08]. This could occur either by crawling publicly available OpenAI output or by heavy use of OpenAI’s API [00:35:41]. OpenAI has stated it found evidence that Deepseek used its proprietary models for training [00:36:19].
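Schematically, distillation reduces to the following sketch (all names are hypothetical placeholders; no vendor’s real API is implied): harvest a stronger teacher’s answers, then fine-tune the smaller student to imitate them.

```python
from typing import Callable, List, Tuple

def distill(
    teacher: Callable[[str], str],   # stronger model being queried
    prompts: List[str],              # questions posed to the teacher
    fine_tune: Callable[[List[Tuple[str, str]]], None],  # student trainer
) -> None:
    # 1. Harvest the teacher's answers for a bank of prompts.
    pairs = [(p, teacher(p)) for p in prompts]
    # 2. Supervised fine-tuning of the student on those (prompt, answer)
    #    pairs, so it learns to imitate the teacher's behavior.
    fine_tune(pairs)

# Toy usage with stand-ins for the real models and trainer.
distill(lambda p: f"answer to: {p}",
        ["What is 2+2?", "Summarize relativity."],
        lambda pairs: print(f"fine-tuning on {len(pairs)} pairs"))
```

Both routes mentioned above, crawling public output and heavy API use, amount to step 1 of this loop.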
The situation is further complicated by Microsoft, a key partner of OpenAI, hosting Deepseek’s R1 model on Azure [00:32:00]. Critics argue this shows a lack of loyalty from Microsoft to OpenAI by facilitating the use of a potentially “stolen” or distilled model that undercuts their partner [00:33:00].
Arguments for Open Source
Proponents of open-source AI argue that it promotes innovation, reduces costs, and benefits humanity [00:40:04]. The idea is that if foundational models become commoditized and cheap, value shifts to the application layer, much as YouTube was built on cheap storage or Uber on GPS [00:43:40]. This would produce a “thousand flowers blooming” effect of widespread innovation [01:05:24].
Additionally, as the cost of AI decreases, demand and usage are expected to increase significantly (Jevons Paradox), leading to more applications becoming economically feasible [00:46:28]. Meta is seen as a crucial player in the open-source space, expected to continue embracing and extending these developments to foster developer ecosystems and applications [00:41:13].
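A toy calculation of the Jevons point above (all numbers hypothetical): if a 10x price drop unlocks a more-than-10x rise in usage, total spend on AI rises even as unit cost falls.

```python
# Hypothetical numbers, purely to illustrate Jevons Paradox.
price_per_m_tokens = 10.00     # dollars per million tokens, before the drop
usage_m_tokens = 50            # million tokens demanded per month

new_price = price_per_m_tokens / 10   # inference gets 10x cheaper
new_usage = usage_m_tokens * 25       # demand more than keeps pace

print(f"old monthly spend: ${price_per_m_tokens * usage_m_tokens:,.0f}")  # $500
print(f"new monthly spend: ${new_price * new_usage:,.0f}")                # $1,250
```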
Arguments for Closed Source / Concerns about Open Source
OpenAI, which began with open-source ambitions but shifted to a closed-source model, faces particular scrutiny here [00:40:04]. The argument for closed-source models often centers on protecting intellectual property and ensuring a return on the substantial capital invested in developing frontier models [00:47:14].
Concerns about open source also include:
- IP Theft: The risk of other entities using proprietary models and data without permission for their own development [00:36:16].
- Geopolitical Strategy: From a US perspective, Chinese companies open-sourcing models could be seen as a strategic move to catch up and undercut leading American companies [00:48:52].
- Safety and Control: The proliferation of powerful AI models through open source raises questions about control and potential misuse, leading to calls for stricter regulation, such as know-your-customer (KYC) requirements for model users [00:37:26].
Industry Implications
The ongoing debate highlights several challenges and opportunities in AI development and deployment:
- Value Chain Shift: If models become commoditized, the value in AI development will shift from building foundational models to creating specialized applications and services built on top of them [00:43:42]. This requires companies to build “shims” or abstraction layers to easily swap out underlying models [00:42:43] (see the sketch after this list).
- Hardware and Data Moats: While any given model’s performance edge may depreciate quickly, durable advantages can still come from controlling unique datasets (e.g., Tesla’s driving data, Google’s YouTube content) or from innovations in hardware and manufacturing [01:08:56].
- Overcapitalization Risks: Excess capital in AI development can remove the constraints that drive innovation, making companies “too soft” or “too bureaucratic” [01:04:48].
- Electricity and Infrastructure: The increasing demand for AI, especially in areas like autonomous vehicles, will place immense pressure on electricity grids and require massive investment in power generation and infrastructure to support these technologies [01:28:15].
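As a sketch of the “shim” idea from the value-chain point above (interface and provider names are hypothetical, not any real SDK): application code depends only on a thin abstraction, so the underlying, increasingly commoditized model can be swapped in one place.

```python
from typing import Protocol

class ChatModel(Protocol):
    """Minimal provider-agnostic interface; a production shim would also
    normalize streaming, tool calls, pricing, and error handling."""
    def complete(self, prompt: str) -> str: ...

class ProviderA:
    def complete(self, prompt: str) -> str:
        return f"[provider-a] {prompt}"

class ProviderB:
    def complete(self, prompt: str) -> str:
        return f"[provider-b] {prompt}"

def answer(model: ChatModel, question: str) -> str:
    # Business logic sees only the interface, so swapping the underlying
    # model is a one-line change at the call site.
    return model.complete(question)

print(answer(ProviderA(), "hello"))
print(answer(ProviderB(), "hello"))
```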
The future of AI development will likely involve a dynamic interplay between open-source collaboration and proprietary innovations, with market forces and geopolitical considerations shaping the landscape [00:49:39].