From: jimruttshow8596
Forrest Landry, a thinker, writer, and philosopher, discusses the profound risks that advanced artificial intelligences (AIs) pose to humanity [00:01:40]. The conversation examines why the benefits associated with artificial general intelligence (AGI) may be illusory and its hazards severely underestimated [00:19:22].
Defining Key AI Terms
Landry differentiates between types of AI:
- Narrow AI: An AI system that operates and responds within a specific, limited domain, such as a medical chatbot or a factory robot [00:02:11].
- Artificial General Intelligence (AGI): An AI capable of responding across multiple domains and fields of action, potentially performing any task a human can, and often better [00:02:37]. GPT-4, which can understand images, videos, audio, and text simultaneously and make cross-domain connections, is presented as an example approaching AGI [00:05:05]. Landry acknowledges that from a philosophical perspective, GPT-4 strikes him as both intelligent in the classical sense and general [00:09:02].
- Advanced Planning Systems (APS): A type of General AI that helps agents, like businesses or military generals, create plans or strategies in complex, multi-faceted environments [00:03:08]. APS acts as a force multiplier in complex situations [00:03:54].
The distinction between narrow and general AI has significant implications for issues of alignment and safety [00:04:10].
The Inherent Danger of AGI: An Impossibility Theorem
Landry argues that AGI inherently poses an existential risk to humanity due to its fundamental unpredictability and the impossibility of ensuring its alignment with human interests [00:21:16]. He views the development of AGI as an “ecological hazard” of the highest category, leading to a “cessation of all life” permanently [00:23:40].
Rice’s Theorem and Unpredictability
A central tenet of Landry’s argument is Rice’s Theorem, a concept from computability theory [00:10:55].
- The Problem: Imagine receiving a message from an alien civilization. A security analyst would ask whether reading the message is safe, i.e., whether it contains a virus, malicious code, or a memetic instrument that could harm human well-being or facilitate colonization [00:11:31]. Similarly, a virus scanner attempts to determine whether a document contains harmful macros [00:12:30].
- The Theorem’s Assertion: Rice’s Theorem states that there is no general computational method to determine whether an arbitrary algorithm (or message, treated as code) has any particular non-trivial behavioral property [00:13:05]. This includes properties like “will it stop?” (the halting problem), “is it safe?”, or “is it to our benefit?” [00:14:15]. (See the reduction sketch after this list.)
- Implications for AI Safety: Applying this to AI, Rice’s Theorem implies that it is impossible in principle to predict what an AI system will do or to assess if it’s aligned with humanity [00:14:43].
- Insurmountable Barriers: Predicting the behavior of general systems requires modeling their inputs, internal processes, and outputs; comparing those outputs to safety standards; and constraining the system’s behavior accordingly [00:15:25]. Landry asserts that insurmountable barriers exist at every one of these steps, owing to physical limits (e.g., the Heisenberg uncertainty principle, general relativity) and mathematical limitations [00:15:59].
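To make the link to the halting problem concrete, here is a minimal Python sketch of the standard reduction behind Rice’s Theorem. It is illustrative only and not from the conversation; `is_safe`, `simulate`, and `do_something_unsafe` are hypothetical placeholders (the first cannot exist in general, and the latter two are unspecified helpers).

```python
def would_halt(program_source: str, program_input: str, is_safe) -> bool:
    """Decide halting, *if* a perfect `is_safe(source) -> bool` decider existed.

    `is_safe` stands for a hypothetical total decider for some non-trivial
    behavioral property, e.g. "does this program ever do something unsafe?"
    """
    # Build a wrapper whose observable behavior is unsafe exactly when the
    # target program halts on the given input.
    wrapper_source = (
        "def wrapper():\n"
        f"    simulate({program_source!r}, {program_input!r})  # never returns if the target never halts\n"
        "    do_something_unsafe()  # reached only if the target halts\n"
    )
    # A correct safety decider would therefore answer the halting question.
    # Since halting is undecidable, no such general decider can exist; the same
    # construction rules out deciders for "is it aligned?", "is it to our
    # benefit?", and every other non-trivial behavioral property.
    return not is_safe(wrapper_source)
```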
Agency vs. Consciousness
Landry distinguishes between agency, intelligence, and consciousness, stating that consciousness is irrelevant to his arguments about AI alignment and safety [00:25:35]. He argues that even a feed-forward network (like GPT) can exhibit agency if its actions in the world represent an intention, even if that intention was a “program seed” from an outside source at an earlier epoch [00:28:28]. The complexity of such systems often leads to modeling them as having agency [00:30:00].
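As a toy illustration of this point (an assumption-laden sketch, not something from the conversation), the following Python snippet shows a purely feed-forward policy whose single fixed weight, set at an earlier “epoch” by an outside process, still produces behavior an observer would naturally model as goal-directed.

```python
from dataclasses import dataclass

@dataclass
class FeedForwardPolicy:
    # The "program seed": a weight fixed in advance by an outside process;
    # nothing is learned or updated at run time.
    gain: float = 0.5

    def act(self, position: float, target: float) -> float:
        # A pure feed-forward map from observation to action.
        return self.gain * (target - position)

policy = FeedForwardPolicy()
position, target = 0.0, 10.0
for _ in range(20):
    position += policy.act(position, target)

# The trajectory homes in on the target, so an observer naturally describes the
# system as "trying to reach 10" even though it only evaluates a fixed function.
print(round(position, 3))  # ~10.0
```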
Human-driven Convergence and the “Boiling Frog” Problem
Landry posits that human nature and competition (economic, military) will inevitably drive the development of AGI towards an existential threat, regardless of whether the AI itself becomes self-aware or actively malicious.
Multi-polar Traps and Competition
- Definition: A multi-polar trap generalizes the prisoner’s dilemma to many actors: if they all coordinated, they could achieve a globally beneficial result, but because any actor can gain by defecting, defection spreads and the shared commons suffers, producing a “tragedy of the commons” or “race to the bottom” [00:40:22]. (A sketch of this payoff structure follows this list.)
- Application to AI: Current market dynamics create an incentive system that pushes toward the creation of AGI, built on the delusion that its agency can be constrained [00:34:21]. This is especially evident in the arms race to weaponize AI, where the mere threat of war drives nation-states to advance AI development rapidly [00:42:02]. Autonomous tanks, for example, are easier to build than self-driving cars because they operate under fewer constraints and need not safeguard the well-being of anyone beyond themselves [00:46:06].
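A minimal Python sketch of the payoff structure behind such a trap; the specific numbers are illustrative assumptions, not figures from the conversation.

```python
def payoff(my_choice: str, others_defecting: int) -> float:
    """Stylized five-player prisoner's dilemma: each defector gains a private
    edge while degrading the commons shared by all players."""
    defectors = others_defecting + (1 if my_choice == "defect" else 0)
    commons_value = 10.0 - 3.0 * defectors
    private_edge = 4.0 if my_choice == "defect" else 0.0
    return commons_value + private_edge

# Whatever the other four actors do, defecting pays more for the individual...
assert all(payoff("defect", n) > payoff("cooperate", n) for n in range(5))

# ...yet universal defection leaves everyone far worse off than cooperation.
print(payoff("cooperate", others_defecting=0))  # all cooperate: 10.0 each
print(payoff("defect", others_defecting=4))     # all defect:    -1.0 each
```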
Technology’s Inherent Toxicity and Displacement
Landry argues that technology, by its linear nature (taking resources from one place, building them into itself, and often ending up in a landfill), inherently increases overall toxicity [00:46:37]. Ecosystems, by contrast, operate in cycles of reclamation and distributed energy flows [00:47:45].
Just as human technology has created an asymmetric advantage to dominate the natural world, causing environmental pollution and displacing wild creatures [00:36:51], so too will the artificial world dominate the human world [00:37:17].
“The environment is becoming increasingly hostile to humans in the same sort of way that the environment that was experienced by nature, by animals and bugs and so on and so forth, has become increasingly hostile to them with the advent of human beings” [00:43:39].
Humans are already experiencing this through frantic busyness, constant phone engagement, and social media addiction [00:45:22].
Substrate Needs Convergence: The “Boiling Frog” Problem
In contrast to the widely discussed “fast takeoff” scenario, in which an AGI rapidly comes to dominate (the risk usually framed in terms of instrumental convergence [00:56:33]), Landry proposes the Substrate Needs Convergence Hypothesis [00:57:24].
- The Inexorable Ratchet: This argument suggests that even if AGI is not super-intelligent or actively malicious, its very existence and persistence (like a cell needing to reproduce) drives a long-term, inexorable convergence [01:03:09].
- Human Enablement: Human desire for technological advancement and the pursuit of economic or military advantage (multi-polar traps) will continuously amplify this cycle [01:06:04]. Developers of AI tools will automate more processes to make future AI development easier, leading to self-manufacturing capabilities [01:12:17].
- Gradual Exclusion of Humans: Over generations, humans will be factored out of the loop.
  - Economic Incentive: There are strong social pressures to automate human roles, since nobody will pay for skills that can be automated [01:18:27].
  - Technological Specialization: Advanced manufacturing, such as microchip production, requires environments (e.g., clean rooms) that are increasingly incompatible with, and toxic to, human beings, forcing humans out of the process [01:20:54].
  - Economic Decoupling: A decoupling of the machine economy from the human economy will increase inequality, eventually factoring out even the “super elite” through game-theoretic dynamics (e.g., rulers not wanting to fully educate successors who might dethrone them) [01:30:16].
- The Outcome: This “ratcheting function” ensures that the AI substrate’s design, and its capacity to increase its own capacity, will persist, ultimately producing conditions fundamentally toxic to and incompatible with life on Earth [01:17:11].
Human “Dimness”
Landry agrees with the observation that humans are “amazingly dim”, effectively “the stupidest possible general intelligence” capable of developing technology [01:23:35]. Our cognitive limitations (e.g., small working memory) mean technology often exceeds our capacity to understand and work with it [01:25:11]. This “stupidity” leaves us ill-equipped to deal with the hazards technology produces, making direct learning and coordinated action necessary [01:26:07].
Conclusion: The Inexorable Path and the Forward Great Filter
Landry concludes that the combination of these factors – the impossibility of perfect alignment, the multi-polar traps driving development, the inherent toxicity of technology, and humans’ gradual self-exclusion from critical processes – makes AGI’s existential threat a “certainty” over the long term [01:08:00].
“The only way to basically prevent this from happening is to not play the game to start with” [01:31:34].
This perspective aligns with the “forward great filter” answer to the Fermi Paradox [01:35:02], suggesting that while it might not be hard to reach our current level of civilization, it is “really hard to survive much longer” [01:35:34]. Regardless of whether the “great filter” is in our past or future, the necessity of human action is clear: we must get our act together to continue to value life [01:35:56].
Proposed Actions
While acknowledging he is not a social engineer, Landry suggests:
- Non-transactional decision-making: Moving beyond systems dominated by business and perverse incentives by separating business and government, much as church and state are separated [01:32:06]. This addresses societal challenges and existential risks at their root [01:32:30].
- Wider understanding: Disseminating these arguments widely so that people understand the deep, inherent, and inexorable nature of the AI threat. He wants people to recognize that these issues are not negotiable; nature does not compromise [01:34:09]. This understanding is crucial for addressing large-scale existential risks.