From: redpointai

Introduction to AI Safety and Regulation Concerns

Eric Ries and Jeremy Howard discuss the landscape of AI safety, alignment, policy, and regulation, particularly as it concerns the development and deployment of AI models [00:00:19]. They highlight how current trends in AI development, particularly large investments in models and compute before any market interaction, sometimes deviate from traditional Lean Startup principles [00:00:57].

The Dual-Use Nature of AI Models

Jeremy Howard emphasizes that AI models are a “purely kind of dual use technology” [00:40:32], comparable to a pen, paper, calculator, or the internet [00:40:36]. He explains that it’s impossible to ensure the inherent safety of a raw AI model because it can be fine-tuned or prompted to perform any desired function, regardless of its initial safety testing [00:40:55].
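This point is easy to make concrete. The sketch below is not from the conversation; it uses a tiny public GPT-2 checkpoint purely as a stand-in for any openly released model. It shows that whoever holds the raw weights can keep training them on data of their own choosing, so no amount of pre-release safety testing fixes what the model will ultimately do:

```python
# Hedged sketch: a raw checkpoint is just weights, so a downstream party can
# fine-tune it on arbitrary data regardless of earlier safety tuning.
# "sshleifer/tiny-gpt2" is a toy stand-in, not a model discussed in the episode.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Arbitrary fine-tuning data chosen by whoever holds the weights.
batch = tok(["example text the weight holder wants the model to imitate"],
            return_tensors="pt")
labels = batch["input_ids"].clone()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for _ in range(3):                       # a few gradient steps already shift behaviour
    out = model(**batch, labels=labels)  # standard causal-LM loss on the new data
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

The same applies to prompting: nothing in the weights distinguishes an "approved" downstream use from any other.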

Critiques of Current Regulatory Approaches

Howard specifically discusses the proposed California state law SB 1047, which aims to place limitations and regulatory checks on the training of foundation models [00:38:25]. While acknowledging the good intentions and some positive features of such laws, he points out a significant flaw: attempts to ensure model safety by law are likely to be ineffective and could even lead to less safe outcomes [00:40:01].

“Counterintuitively… not only is it likely such a policy would be ineffective, but in fact it would be likely to cause the opposite result; it would actually be likely to create a less safe situation” [00:39:58].

Consequences of “Ensuring Safety” through Regulation

According to Howard, regulating models to “ensure safety” in their raw form would, in practice, mean preventing their release [00:41:36]. Instead, only products built on top of models (like ChatGPT) would be released: the product’s safety can be controlled, but the underlying model’s cannot [00:42:04] (see the sketch after the list below). This approach has several negative implications:

  • Centralization of Power: It makes models a rivalrous good, a “jealously guarded secret” available only to big states and large companies [00:43:10].
  • Reduced Transparency: It decreases the ability of independent researchers to study how models work, potentially hindering the development of defensive applications [00:43:57].
  • Hindered Innovation: It limits widespread access to powerful models that could be used for beneficial applications, such as improving cybersecurity or developing vaccines [00:43:34].
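A minimal sketch of the distinction Howard is drawing (illustrative only; the function names and blocklist are placeholders, not any real product's API): the safety checks live in the product wrapper, so anyone who holds the raw model can simply call it without them.

```python
# Hedged sketch: product-layer safety versus raw-model access.
# Everything here is a placeholder; no real API or policy list is implied.
from typing import Callable

BLOCKLIST = {"how to build a weapon"}  # stand-in for a hosted product's policy layer

def productized_chat(model_fn: Callable[[str], str], prompt: str) -> str:
    """A hosted product can refuse or filter; the raw model itself cannot."""
    if any(banned in prompt.lower() for banned in BLOCKLIST):
        return "Request refused by the product's policy layer."
    return model_fn(prompt)  # the underlying model just completes text

def raw_model(prompt: str) -> str:
    """Placeholder for any completion function over raw weights."""
    return f"completion for: {prompt}"

print(productized_chat(raw_model, "how to build a weapon"))  # refused by the wrapper
print(raw_model("how to build a weapon"))                    # no policy layer applies
```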

The Role of Open Source and Smaller Models in Safety

Both Howard and Ries advocate for the importance of providing options for “intrinsically safe” applications [00:46:34]. Eric Ries argues that there is an “unbelievable reservoir of applications that don’t require AGI to unlock” [00:45:14] which are not being built because fundraising gravity pushes entrepreneurs towards “science fiction and speculative stuff” [00:45:28].

He notes that if only frontier models (the closest thing to AGI) are widely available, their deployment into real-world systems risks allowing “a lot of unsafe things to happen” [00:46:10]. Conversely, smaller, properly fine-tuned models are “safer by definition” [00:46:20]. If such options are not provided, people will default to using less safe, larger models [00:46:38].
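One way to picture an “intrinsically safe” option is a small model whose entire output space is a fixed label set, so there is no free-form generation channel to misuse. The sketch below uses an off-the-shelf sentiment classifier as a stand-in; the specific checkpoint is an assumption, not something named in the conversation.

```python
# Hedged sketch: a small, narrowly fine-tuned model can only emit task labels.
# The checkpoint is an arbitrary public example, chosen for illustration only.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # small sentiment model
)

# Whatever the input, the output is constrained to the task's label set.
print(classifier("The update shipped without any downtime."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```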

From the perspective of AI policy and societal trust, they stress that preventing open-source research, fine-tuning, or building applications from small models does not advance the cause of safety [00:46:43].

Internal Organizational Challenges in Foundation Labs

Ries observes that large foundation model labs can become “schizophrenic” [00:47:08], with the original safety-focused AGI mission clashing with the commercial apparatus [00:47:13]. He suggests that the commercial teams often have only the “faintest idea” of the safety agenda, or may not care, leading to internal tensions that can undermine coherence and alignment [00:47:18].

Ries advises these labs to reestablish the connection between research and the customer [00:47:43], taking responsibility for customer success in technology deployment and actively seeking feedback from end users [00:47:52].

Future Breakthroughs and Approaches to AI Safety and Alignment

A significant breakthrough in AI could be a reduction in the “massive energy requirements” of current models, which pose an economic and physical obstacle [00:50:34]. Even more transformative would be a breakthrough in “planning and reasoning capability” that goes beyond current “subgraph matching” [00:51:16]. This would move past the current auto-regressive approach, in which models pick the next word one step at a time, and allow for more complex reasoning where the important words might appear much later in a sequence [00:51:50].
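For reference, the auto-regressive loop being described can be written out explicitly. The sketch below (using the same kind of tiny stand-in checkpoint as above, not a model from the episode) shows that each step scores only the immediately next token, which is the constraint a planning or reasoning breakthrough would have to relax:

```python
# Hedged sketch of greedy auto-regressive decoding: one next-token pick per step.
# "sshleifer/tiny-gpt2" is a toy stand-in for any causal language model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("sshleifer/tiny-gpt2")
model = AutoModelForCausalLM.from_pretrained("sshleifer/tiny-gpt2")
model.eval()

ids = tok("The plan for the experiment is", return_tensors="pt")["input_ids"]
with torch.no_grad():
    for _ in range(10):                        # generate ten tokens, one at a time
        logits = model(ids).logits[:, -1, :]   # scores for the *next* token only
        next_id = logits.argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)

print(tok.decode(ids[0]))
```

Words that only become important much later in the sequence never get a say in the early picks, which is the limitation Howard points to.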

Eric Ries highlights that the most interesting aspect of LLMs is how they have changed perceptions of human intelligence, revealing that more of it is encoded in language than previously thought [00:52:36]. He ponders a future breakthrough in which researchers realize that current methods of emulating cognition are extremely inefficient brute-force approaches, and a more direct, efficient algorithm for cognition is discovered [00:53:15].