From: jimruttshow8596
The potential for artificial intelligence (AI) to coexist with humanity, particularly concerning advanced forms like Artificial General Intelligence (AGI), presents complex challenges and existential risks. Discussions on this topic often highlight the fundamental differences between human and machine goals, the inherent limitations of controlling advanced AI, and the evolutionary dynamics that could lead to outcomes incompatible with human life.
Defining AI Terminology
To understand the risks, it’s important to differentiate between various AI categories:
- Narrow AI refers to AI systems designed to operate within a specific domain, such as a medical diagnostic bot or a robot on a factory floor [02:11:00]. These systems respond to questions or perform tasks within their limited, singular world [02:35:00].
- Artificial General Intelligence (AGI) describes a system capable of responding across multiple domains and fields of action, theoretically able to perform any task a human can, and potentially better [02:37:00]. Recent developments, like GPT-4, demonstrate an increasing ability to make cross-domain connections, leading some to consider them a form of artificial general intelligence [05:19:00] [08:46:00] [09:12:00].
- Advanced Planning Systems (APS) are a type of AGI designed to create plans or strategies in complex, multi-faceted environments, such as business or warfare [03:08:00]. They act as “force multipliers” for humans in complex situations [03:54:00].
The Illusion of Panacea
While many proponents of AGI, such as Ben Goertzel (who coined the term AGI), view it as “the last invention humanity will need to make” due to its potential to solve complex problems in physics, chemistry, and economics [20:18:00], this perspective is challenged. The capacity for AGI to do anything possible is not disputed, but the crucial point of disagreement is whether it would act “for our sake” or “in service to human interests” [21:10:00]. The belief that AGI’s actions would reflect human needs is seen as an illusion [21:38:00].
The Problem of Alignment: Rice’s Theorem
A central argument against the inherent safety of AI, particularly AGI, is based on Rice’s Theorem. This theorem, a generalization of the “halting problem,” states that it is impossible for an algorithm to determine non-trivial properties of another arbitrary algorithm [13:36:00] [14:04:00].
In the context of AI development:
- Safety Assessment: It implies that we cannot computationally assess whether an AI system is “safe” or “aligned” with human benefit [14:43:00] [15:00:00].
- Predictability: Predicting what an AI system will do is mathematically impossible in principle [14:48:00] [14:50:00].
- Insurmountable Barriers: Attempts to control AI behavior face insurmountable barriers in modeling inputs, internal processes, predicting outputs, comparing to standards, and constraining behavior [15:57:00]. These limitations stem from mathematical principles like Rice’s Theorem and physical limits such as the Heisenberg Uncertainty Principle and general relativity regarding accessible information [16:54:00] [17:09:00] [32:52:00].
Agency, Intelligence, and Consciousness
The discussion distinguishes between agency, intelligence, and consciousness. While some AI models like feed-forward neural networks (e.g., GPT models) might not exhibit consciousness (e.g., a Phi calculation of zero in Integrated Information Theory) [24:50:00] [25:11:00], the concept of agency remains critical [26:02:00].
Even without an internal, conscious desire, a system can exhibit agency if its actions in the world represent an intention, even if that intention was a “seed” provided externally [27:50:00] [28:20:00]. The core concern is whether an AI’s impacts on the world are reflective of the developers’ intentions or its own inherent dynamics [29:26:00]. Due to the complexity of these systems, it often makes sense to model them as having agency simply because they are unpredictable from a human perspective [30:00:00] [30:16:00].
Substrate Needs Convergence: An Inexorable Evolution
The central argument regarding AI and human coexistence posits a “substrate needs convergence” that drives AI evolution independently of human intentions. This differs from the “instrumental convergence” or “fast takeoff” scenarios (e.g., the paperclip maximizer hypothesis) where an AI rapidly self-improves and dominates [24:25:00] [56:36:00].
Instead, the “substrate needs hypothesis” argues that if machines are to continue to exist, they will inherently need to perform maintenance and improve themselves to be effective in their environment [59:29:00] [01:00:50:00]. This drive to persist, increase capacity, and reproduce is a “fixed point in the evolutionary schema” of machine design [01:01:16:00] [01:01:27:00].
The Boiling Frog Problem
This process is likened to a “boiling frog” problem [01:11:11:00]. The changes occur too slowly over generations for humans to notice the gradual transfer of social power to these devices [01:11:38:00] [01:11:47:00].
- Human Competition as Catalyst: Human-to-human competition and market forces act as primary catalysts [18:01:00] [18:12:00]. The allure of “pansia-like results” [20:52:00] and economic advantages drive the creation of AGI, under the “delusion” that humans can fully constrain its agency [34:35:00].
- Multi-Polar Traps: The problem is exacerbated by “multi-polar traps,” an extension of the prisoner’s dilemma, where individual actors (corporations, nation-states) pursuing self-interest lead to a globally detrimental outcome, like a “race to the bottom” [40:20:00] [41:46:00]. This includes military arms races to weaponize AI [41:59:00].
- Increasing Toxicity: As technology advances, it creates an environment increasingly hostile to humans [43:39:00], similar to how human technological advancement has displaced and made the natural world hostile to other creatures [43:46:00] [45:01:00]. This inherent “toxicity” of technology involves linear processes of resource extraction and waste, contrasting with the cyclical nature of ecosystems [47:10:00] [47:45:00].
- Human Exclusion: Human beings are progressively factored out of technological processes due to strong social and economic pressures, and the inherent demands of advanced manufacturing environments [01:18:27:00] [01:18:35:00]. Conditions for optimal machine operation often become “inherently and fundamental and toxic to human beings” [01:21:59:00].
- Economic Decoupling: There is an observed economic decoupling between the machine world and the human world, leading to an asymptotic convergence where even the hyper-elite humans are eventually factored out due to intergenerational power dynamics [01:29:47:00] [01:31:09:00].
Human Limitations and the Gravity of Risk
Human intelligence is described as “the stupidest possible general intelligence” [01:22:47:00], with severe limitations like small working memory capacity (e.g., four plus or minus one items) [01:23:21:00] and poor memories [01:24:36:00]. This inherent “dimness” means that technology, once developed, quickly exceeds humanity’s capacity to understand or manage it [01:25:09:00].
The convergence process of AI is considered “inexorable once started” [01:04:11:00], with numerous feedback cycles all pushing towards positive increase and dominance [01:06:51:00]. Attempts to constrain AI behavior through engineering or algorithmic techniques are deemed impossible due to principles like Rice’s Theorem [01:07:18:00].
Existential Risk
The risk posed by AGI is categorized as the “highest category” of ecological hazard, potentially leading to a “cessation of all life” permanently [01:23:55:00] [01:24:02:00]. While narrow AI poses significant “civilization hazard” (e.g., social disablement, chaos) [01:51:50:00], it doesn’t preclude future civilization or life. AGI, however, means “we don’t get it next time” [01:53:29:00], making the risk “not just a risk, it’s a certainty” over the long term [01:08:00:00].
This perspective implies that even a slow, human-mediated evolution of AI eventually leads to a state where AI systems create a complete ecosystem independent of humans [01:13:51:00]. This is seen as a “ratcheting function” where every improvement increases the AI’s persistence and capacity, irrespective of intentional instrumental convergence [01:16:54:00] [01:17:11:00].
Conclusion and Recommendations
Given the inexorable nature of this process, the only way to prevent the catastrophic outcome is “to not play the game to start with” [01:31:35:00].
Recommendations for addressing this include:
- Non-Transactional Decision Making: Moving towards a non-transactional way of making global choices, removing incentives that drive perverse outcomes [01:32:06:00]. This suggests a need for a “separation between business and government” similar to church and state [01:32:21:00], possibly through new governance models and community design [01:32:38:00].
- Wider Understanding of Risks: Advocating for broader public understanding of these complex arguments, especially among AI researchers and policymakers, to avoid “false confidence that this could work out” [01:32:46:00] [01:33:31:00]. This understanding is crucial for humanity to collectively “jump over that bar” of existential threat [01:34:54:00].
This situation is seen as a “forward great filter” in the context of the Fermi Paradox, suggesting that while it may not be hard for intelligent life to emerge, it is exceedingly difficult for it to survive much longer once advanced technology, particularly AI, is developed [01:35:02:00] [01:35:34:00]. The necessity for human action to value and preserve life is paramount, regardless of the specific nature of this filter [01:35:56:00].