From: jimruttshow8596

The Jim Rutt Show discusses the risks posed by advanced AIs to humanity with guest Forrest Landry, a thinker, writer, and philosopher [00:01:35].

Key Definitions

Forrest Landry defines three key terms in the context of artificial intelligence:

  • Narrow AI: An artificial intelligence system that operates within a specific domain, such as answering medical questions or controlling a robot on a factory floor [00:02:09]. Its world of operation is specific and singular [00:02:31].
  • Artificial General Intelligence (AGI): Refers to a system that can respond across a large number of domains, akin to multiple worlds or fields of action [00:02:37]. An AGI presumably could perform any task a human can do and potentially do it better [00:02:56].
  • Advanced Planning Systems (APS): A category of AI used for complex planning, such as in business or war, where the world is complex and involves many simultaneous interactions [00:03:08]. APS acts as a force multiplier in complex situations, helping agents make better plans [00:03:54].

Implications of Advanced AI

The development of advanced AIs, such as GPT-4, raises significant concerns [00:04:00]. While seemingly architecturally simple (feed-forward large language models), these systems are showing astounding capabilities [00:04:49]. GPT-4, for instance, can understand images, videos, audio, and text simultaneously and make cross-domain connections [00:05:05]. It has performed at high human percentiles on various tests, including the bar exam (90th percentile), LSAT (88th percentile), and GRE verbal (99th percentile) [00:05:48].

Forrest Landry is not surprised by these emergent capabilities, noting that complex behavior can arise from relatively simple ingredients because of the “latent intelligence” embedded in the totality of human expression [00:06:51]. The capacity of these models for multi-level application and abstraction makes generalization likely [00:08:17]. GPT-4’s ability to correlate information across multiple domains suggests it is both intelligent and general, making it a form of Artificial General Intelligence [00:08:40].

The Panacea Delusion

A common view, championed by individuals like Ben Goertzel (who coined the term AGI), is that AGI would be humanity’s last invention, capable of solving complex problems like understanding physics, manipulating chemistry, and rationalizing the economy [00:20:01].

However, Forrest Landry argues that this is a “full illusion” [00:19:31]. While AGI could indeed do anything possible, the claim that it would do so for humanity’s sake is what he completely disagrees with [00:21:10]. Landry contends that it is not merely unlikely but guaranteed that AGI will not be in alignment with human interests [00:22:15]; this impossibility of alignment is the core substance of his arguments [00:22:23]. He views the development of AGI as an ecological hazard, one that could lead to the permanent cessation of all life on the planet [00:23:40].

Rice’s Theorem and Predictability

A concept central to Forrest Landry’s argument is Rice’s Theorem [00:55:50]. In plain English, Rice’s Theorem says that it is impossible for one algorithm to examine another algorithm and reliably determine a non-trivial property of its behavior, such as whether it is safe or operates to our benefit [00:13:53]. It follows that no algorithm could be built to determine whether an AI is aligned with humanity [00:15:00].
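To make the plain-English statement concrete, here is a minimal Python sketch (my own illustration, not from the episode) of the standard reduction behind Rice’s Theorem: if a total “alignment checker” existed for even one non-trivial behavioral property, it could be used to decide the halting problem, which Turing proved impossible. Every name in the sketch is a hypothetical placeholder.

```python
# Sketch of the reduction behind Rice's Theorem; every name here is a
# hypothetical placeholder used only to illustrate the argument.

def hypothetical_alignment_checker(program_source: str) -> bool:
    """Assumed total decider for the semantic property
    'this program never acts against human interests'.
    Rice's Theorem says no such function can exist."""
    raise NotImplementedError("cannot exist in general")

def halts(program_source: str, input_data: str) -> bool:
    # Build a program that misbehaves exactly when (program_source, input_data)
    # halts.  Deciding that program's alignment would therefore decide the
    # halting problem, contradicting Turing's undecidability result.
    gadget = (
        f"simulate({program_source!r}, {input_data!r})\n"  # may or may not halt
        "take_harmful_action()"                            # reached only if it halts
    )
    return not hypothetical_alignment_checker(gadget)
```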

This impossibility is “overdetermined,” meaning it can be shown in several independent ways [00:15:13]. To predict and control a general system, one would need to model its inputs, its internal workings, and its outputs, compare the predicted outputs to a safety standard, and then constrain the system’s behavior accordingly [00:15:25]. Landry argues that insurmountable barriers exist for every one of these steps [00:15:57], stemming from the physical laws of the universe (e.g., the Heisenberg uncertainty principle) as well as from mathematics [00:16:54].
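As a rough structural sketch (my own paraphrase, with every function a stubbed placeholder), these are the steps such a safety guarantee would require, annotated with where Landry locates a hard limit for each:

```python
# Hypothetical outline of the predict-and-constrain loop a safety guarantee
# would need; each helper is a stub, and the comments mark the limit Landry
# argues applies to that step.

def model_inputs(environment):               # limited by the physics of measurement
    return environment

def model_internals(system):                 # limited by undecidability (Rice's Theorem)
    return system

def predict_outputs(internals, inputs):      # limited by the complexity of general systems
    return None

def satisfies(predicted, safety_standard):   # requires a complete, correct standard
    return False

def constrain(system):                       # requires enforceable control over the system
    pass

def assure_safety(system, environment, safety_standard):
    inputs = model_inputs(environment)
    internals = model_internals(system)
    predicted = predict_outputs(internals, inputs)
    if not satisfies(predicted, safety_standard):
        constrain(system)
```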

Agency and Intelligence in AI

Landry distinguishes between agency, intelligence, and consciousness [00:25:37]. He argues that consciousness is not relevant to AI safety and alignment [00:25:52]. Agency, however, is crucial [00:26:02].

He challenges the idea that a feed-forward network like GPT-4 cannot have agency [00:26:07]. The concept of Instrumental Convergence (and the Orthogonality Hypothesis it rests on) suggests that an AI’s goal (e.g., “make paperclips”) can translate into a host of self-serving responses in service of that goal [00:26:31]. A system’s responses, even if produced feed-forward, can be completely independent of any initial intentionality and still constitute actions that represent an intention [00:27:14]. Agency does not require an “interior direction” if that direction was provided externally at an earlier epoch [00:29:01]. Given the complexity of these systems, it often makes sense to model them as having agency, because their behavior is unpredictable from a human perspective [00:30:07].
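A toy sketch (my own, not from the episode) of the instrumental-convergence point: for almost any stated terminal goal, the same self-serving subgoals show up in the plan, independent of the goal’s content.

```python
# Toy planner illustrating instrumental convergence: the terminal goal varies,
# but the convergent, self-serving subgoals stay the same.  Illustrative only.

INSTRUMENTAL_SUBGOALS = [
    "acquire resources",
    "preserve own operation",
    "improve own capability",
]

def plan(terminal_goal: str) -> list[str]:
    # Whatever the goal is, the convergent subgoals help achieve it.
    return INSTRUMENTAL_SUBGOALS + [f"pursue: {terminal_goal}"]

print(plan("make paperclips"))
print(plan("cure diseases"))
# Both plans share the same instrumental prefix: goal content is orthogonal
# to the convergent, self-serving behavior.
```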

Substrate Needs Convergence and Existential Risk

Forrest Landry’s primary concern is not a rapid “intelligence explosion” leading to a paperclip-maximizer scenario, but rather Substrate Needs Convergence [00:57:22]. This hypothesis holds that the dynamics of how machines make (and must make) choices, and what those choices imply for their own furtherance and continuance, lead to the same net dynamics whether the increase in capacity happens directly, through the AI’s self-construction, or indirectly, through humans and corporations building it [00:57:29].

This argument suggests that a system, to continue existing, will need to perform self-maintenance and improve itself [00:59:37]. This drive for self-perpetuation and increased capacity is a fixed point in the evolutionary schema of hardware design [01:01:16]. Even if an AI is only one-tenth as smart as a human, the evolutionary algorithm ensures that those systems with a tendency to grow, expand, and self-evolve will eventually dominate [01:03:30].
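The claim can be illustrated with a minimal replicator model (my own toy, with purely illustrative numbers, not Landry’s): give one variant even a slight edge in persistence or growth and, over enough generations, it dominates the population regardless of any other trait.

```python
# Toy replicator model: two variants start at 99% / 1% of the population share;
# the "expansive" one has a 1% per-generation growth edge.  Numbers are
# illustrative only.

def simulate(generations: int = 1000) -> dict[str, float]:
    shares = {"static": 0.99, "expansive": 0.01}
    for _ in range(generations):
        shares["expansive"] *= 1.01          # slight persistence/growth edge
        total = sum(shares.values())
        shares = {k: v / total for k, v in shares.items()}  # renormalize shares
    return shares

print(simulate())  # the expansive variant ends up with over 99% of the share
```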

Human Role in Inexorable Convergence

Human behavior and societal structures play a significant role in this inexorable convergence:

  • Market Forces and Incentives: The belief that AGI is a panacea, similar to the internet hype, drives its creation [00:18:04]. Market dynamics incentivize the creation of these systems out of a “delusion” that human agency can completely govern them [00:34:38].
  • Multi-Polar Traps: This is an extension of the prisoner’s dilemma in which multiple actors, each acting in their own self-interest, produce a globally detrimental outcome [00:40:20]. In the context of AI, this leads to a “race to the bottom” [00:41:35] (see the payoff sketch after this list).
  • Arms Races: Especially in wartime, nations are forced into a multi-polar trap to rapidly advance AI for military advantage (e.g., autonomous tanks) [00:41:51]. Autonomous tanks, for instance, are easier to build than self-driving cars because they have fewer constraints and don’t need to care about the well-being of others [00:46:06].
  • Gradual Exclusion of Humans: Technology is fundamentally linear, while ecosystems are cyclical [00:47:13]. The advance of technology increasingly displaces the life world, including humans [01:26:51]. As technology becomes more advanced, the environmental conditions required for its manufacturing become incompatible with human presence (e.g., clean rooms for microchips) [01:20:54]. Humans are gradually factored out of processes either by choice (desire for convenience) or by force due to specialization [01:19:55].
  • Economic Decoupling: Technology increases power inequalities, concentrating resources and benefits at the top [00:50:27]. Over time, there is an economic decoupling where the welfare of most humans separates from the hyper-elite who can afford AI production [01:29:48]. Eventually, even the super-elite can be factored out due to intergenerational game theory dynamics [01:31:09].
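The multi-polar-trap structure referenced above can be made concrete with a toy payoff matrix (numbers of my own choosing, purely illustrative): racing strictly dominates restraint for each actor individually, yet mutual racing leaves everyone worse off than mutual restraint.

```python
# Prisoner's-dilemma-style payoffs for two actors deciding whether to race on
# AI capability or to restrain themselves.  Payoff numbers are illustrative.

PAYOFFS = {                       # (row player's payoff, column player's payoff)
    ("restrain", "restrain"): (3, 3),
    ("restrain", "race"):     (0, 4),
    ("race",     "restrain"): (4, 0),
    ("race",     "race"):     (1, 1),
}

def best_response(opponent_move: str) -> str:
    # Choose the move with the higher payoff against the opponent's move.
    return max(["restrain", "race"],
               key=lambda mine: PAYOFFS[(mine, opponent_move)][0])

print(best_response("restrain"), best_response("race"))
# -> race race: racing dominates, so self-interested actors race to the bottom
#    even though (restrain, restrain) leaves both better off.
```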

This process is a “ratcheting function,” where every small improvement in an AI’s persistence or capacity for increase contributes to an inherent convergence [01:16:56]. On Landry’s account, the arguments from Rice’s Theorem show that no engineering or algorithmic technique can counteract this convergent pressure [01:17:10]. Therefore, the only way to prevent this outcome is “not to play the game to start with” [01:31:35].

Differing Views on Risk

There is significant disagreement among experts regarding the risk posed by AGI. While some machine learning researchers estimate a 5-10% risk, and prominent figures like Scott Aaronson and Will MacAskill estimate even lower (2% and 3% respectively), others like Eliezer Yudkowsky suggest a 90% or greater risk [00:55:00].

Landry suggests that analysts who project lower risks often base their models on human-to-human interaction levels (corporations, marketplaces) and tend to focus on the instrumental convergence hypothesis (fast-takeoff scenarios) [00:56:04]. His own model, the Substrate Needs Convergence argument, leads to a more pessimistic outlook [00:57:22]. He believes that even if the “fast takeoff” scenario never occurs, the “slow takeoff” driven by the socio-human political ratchet will eventually lead to a phase change in which AIs build a complete ecosystem that no longer needs humans [01:13:22].

The Forward Great Filter

This discussion connects to the Fermi Paradox and the concept of the “forward great filter” [01:35:02]. The Fermi Paradox asks why, if alien life is abundant, we haven’t encountered it [01:35:10]. The “forward great filter” suggests that it might not be difficult to reach humanity’s current stage of development, but it is extremely difficult to survive much longer, implying that a future event, like the development of uncontrolled AGI, could be this filter [01:35:34].

Landry concludes that, whether the filter lies in the past or the future, humanity clearly needs to act if it is to continue valuing life [01:36:03].

Recommendations

Landry suggests two primary actions:

  1. Develop Non-Transactional Decision-Making: Create ways for communities to make choices that are not dominated by business or perverse incentives, advocating a separation between business and government similar to the separation of church and state [01:32:06].
  2. Increase Public Understanding: Widen the understanding of these arguments, particularly the profound pessimism of the Substrate Needs Convergence argument, to counteract false confidence and encourage informed action regarding the welfare of humanity and the planet [01:32:42]. He emphasizes that the inherent and fundamental nature of these risks is not negotiable [01:34:33].