From: jimruttshow8596

Artificial Intelligence (AI) serves as a crucial link between philosophy and mathematics, aiming to automate and scale the execution of processes that enable language to refer to meaning [00:01:34]. This endeavor allows meaningful language to be grounded in a mechanical, understandable universe [00:01:55]. For Joscha Bach, Vice President of Research at the AI Foundation, AI is an attempt to build a testable theory of what a system without “conspiracy inside it” might be [00:02:22]. If successful, this theory would yield systems that are “like us,” capable of reflecting on their environment and understanding their own nature [00:02:42]. Bach views the project of artificial intelligence as a capstone of the philosophical traditions addressing fundamental questions about human existence and the nature of the observer in the universe [00:02:55].

Evolution of AI Research

When AI emerged as a field, it encompassed various disciplines, including cybernetics, computer science, information theory, and psychology [00:04:14]. Early pioneers were highly optimistic, believing that teaching computers to think would be a quick endeavor [00:04:47]. Tremendous progress was indeed made, such as computers playing chess and performing simple language understanding and planning [00:04:58]. Many programming languages used today trace their principles to the first few decades of AI research [00:05:21]. AI often served as the “pioneer battalion” of computer science; once a problem was solved, it became “boring” and integrated into other fields [00:05:46].

However, the initial optimism for building a thinking machine in a short timeframe faded due to the daunting nature of the task [00:05:51]. Researchers began focusing on more achievable goals that yielded results within grant cycles or career durations [00:06:05]. This led to AI becoming more applied and narrow, focusing on automating statistics and developing related mathematics [00:06:16]. The philosophical ambition of AI, which included the question of what consciousness means, was pursued by relatively few individuals [00:06:26].

A significant “political upheaval” in the field occurred when Marvin Minsky asserted that cognitive AI was synonymous with symbolic AI, actively delaying the development of neural networks and dynamical systems models for over a decade [00:06:54]. Minsky’s influence led to a division between “cognitive AI” (his followers) and other approaches, impacting funding and discourse [00:07:09]. This historical split contributed to a disconnect between those studying cognition and psychology and those focusing on image processing or environmental interaction, a division that persists today, though the gap is closing [00:08:07]. Despite this, many in the field, even at large companies like Google, initially entered AI with the grand vision of artificial general intelligence (AGI) [00:08:35].

Optimism for Advanced AI

Joscha Bach remains optimistic about creating computer intelligences that reach or surpass human levels [00:10:26]. He believes there is “nothing magic going on” in the brain and that most deep philosophical questions relevant to this endeavor have been sufficiently answered [00:10:37]. The remaining challenges are technical details, and progress in recent years suggests these doors are opening [00:10:53]. While the exact timeline is hard to predict (five to 100 years or more), Bach sees a significant probability of superhuman capabilities emerging in certain domains within our lifetimes, possibly much sooner than the long end of that range [00:11:03].

The human brain, with limitations like working memory size and memory fidelity, may be “approximately the stupidest possible general intelligence” [00:11:58]. This perspective, rooted in cognitive science and neuroscience, suggests ample room for AI to surpass human capabilities, especially in areas like language understanding and processing large bodies of literature [00:12:11]. The idea that humans could have infinite memory or intelligence is a “superstitious belief” [00:13:18].

Computational Requirements

The computational power required for AGI is a key point of discussion. While some suggest a human-level AI might cost as little as $20,000 in compute, others, like Ben Goertzel, propose a ceiling closer to a million dollars’ worth of today’s hardware (roughly a thousand powerful desktop computers) [00:14:36]. The exact cost depends heavily on the “appropriate level of representation”: the neuron level, the symbolic level, or some hybrid of the two [00:15:07].

Current advanced models like GPT-2 and GPT-3, while not human-like or sentient, demonstrate remarkable capabilities [00:17:52]. Training them costs “two-digit million dollars” and involves processing vast amounts of filtered internet data [00:16:34]. They can generate text, and even images, coherent enough to pass a Turing test [00:18:07]. This ability to extract meaningful correlations from enormous datasets in days or weeks is a “tremendous achievement” [00:17:11]. While GPT models are “extraordinarily deep pattern matching systems,” they lack a unified model of reality or a sense of their own role within it [00:19:43]. However, the coherence improvements in GPT-3 suggest progress in creating the illusion of a self-coherent universe within its generated text [00:21:38].

Regarding brain complexity, Bach suggests that the cortical “minicolumn” (300-400 neurons), rather than the individual neuron, might be the appropriate generic unit for modeling intelligence [00:24:38]. This would mean approximately 100 million such units in the brain, each capable of complex interactions, potentially fitting on a few large GPUs in real time [00:25:14].
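As a back-of-the-envelope sanity check on those figures, using only the numbers quoted above (not precise neuroanatomy):

```python
# Sanity check of the minicolumn estimate, using the figures quoted
# in the conversation rather than precise neuroanatomy.
neurons_per_minicolumn = 350        # midpoint of Bach's 300-400 range
minicolumn_count = 100_000_000      # ~100 million units

implied_neurons = neurons_per_minicolumn * minicolumn_count
print(f"implied neurons: {implied_neurons:.1e}")
# 3.5e+10, i.e. tens of billions, the right order of magnitude for the brain
```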

AI, Reality, and Models

The philosophical discussions around AI lead to fundamental questions about the nature of reality and the mind. Bach challenges the notion that “matter is immediately given” [00:31:18]. Instead, he suggests that matter is a way to talk about information, specifically how we measure “periodic changes in place” (matter) and their movement (momentum) [00:31:44]. Physics, in this view, describes how “adjacent states of the universe are correlated” [00:32:15]. This leads to the hypothesis of a “causally closed lowest layer” describing the universe, a hypothesis that has been very successful and has no strong contender [00:32:27].

Idealism, which posits conscious experience as primary, faces the challenge of explaining the “outside” that “dreams us” [00:33:55]. If this “higher plane of existence” can be modeled with physics, it doesn’t fundamentally change the materialist perspective [00:34:37].

The concept of “mind” can be viewed as an emergent subjective state packaged within physics, arising from complex biological systems like brains [00:35:16]. Cultures focusing on this “mental perspective” or “inner structure of perception” might not delve into physics if they don’t aim to revolutionize society or production [00:36:38]. However, neglecting physics means “leaving money on the table,” as it provides models with “tremendous predictive power” for building machinery and advancing understanding [00:36:51].

Society and Virtual Realities

The concept of “spirit” can be reframed as an “operating system for an autonomous robot,” applicable not just to humans but also to plants, animals, cities, and nation-states [00:38:31]. Society’s operating system is “virtual”; it exists through the “coherent interactions of individuals” rather than as a physical entity [00:39:10]. Similarly, the mind exists over the coherent interactions of its constituent parts, like neurons, rather than as a physical “thing” [00:39:21]. This aligns with the complexity science perspective where “the dance” (emergent behavior) is as important as “the dancers” (individual components) [00:39:44].

Complex adaptive systems, like business companies, are “virtual” but “real” because they have “traction in the physical world” [00:45:07]. Such systems, organized around abstractions like money-on-money return, can exert “top-down causality,” influencing the behavior of the individuals (the “atoms”) within them [00:40:51]. The legal systems that define companies provide a clear “substrate for the company to run in,” allowing companies to be modeled deterministically and precisely [00:46:57]. This contrasts with minds, whose state is “somewhat probabilistic” [00:47:37].

Feedback Loops and “Memetic Viruses”

The notion of feedback loops is ancient, evident in Aristotle’s descriptions of competing forces in the mind [00:41:28]. It became central to cybernetics and control theory [00:42:01]. However, complex models based on statistical dynamics, like Newtonian or Einsteinian mechanics, are models of behavior rather than “real” underlying entities [00:43:08]. Our understanding of reality involves “layers of description” where coherent models can be made, forming hierarchies and causal relationships [00:43:52]. The mind-body problem, in this context, is about making “disparate categories of models” (body map and mental states) congruent [00:44:40].

The discussion extends to “magic” and “memetic viruses.” In the context of computer games, “magic” allows users to change the “rules of reality” beneath the surface layer [01:01:36]. In the real world, “magic” is about manipulating how people “construct reality” by changing their perceptions and relationship to it [01:02:02]. While symbolic rituals or “abundance meditations” can change an individual’s expectations and actions, leading to different outcomes, they don’t impact fundamental physics [01:04:11]. For example, one cannot change the mass of electrons or the spectral characteristics of Alpha Centauri through witchcraft [01:06:37]. This suggests a “causally closed” level of reality where magic is not possible [01:06:50].

“Mind viruses” or “memetic viruses” are radical ideas that challenge the status quo, like the scientific revolution or Darwinism [01:08:42]. They can be seen as “malware” to which the human brain is susceptible [01:08:53]. While such viruses can drive societal change, their impact on fitness is not always positive [01:52:48]. Homogeneity in society, such as reliance on the same social media or news sources, can make populations susceptible to “homogeneous virus infections” [01:53:30]. Societies might even domesticate their populations to be susceptible to certain mind viruses (e.g., via a church and an inquisition) to gain an evolutionary advantage [01:53:57].

Human-Animal Intelligence Gap

The significant cognitive gap between humans and other animals, despite small genetic differences, is a major area of inquiry. While large-brained animals like whales and elephants exist, they don’t display human-level intelligence, perhaps because highly intelligent systems are “hard to control” [02:28:15]. One speculation is that elephants might suffer from “massive autism” or simply meditate themselves “out of existence,” choosing not to reproduce [02:28:58]. Dolphins, constrained by their aquatic environment, might lack the means for intellectual development [02:49:47]. The relative ease of building self-flying airplanes compared to self-driving cars illustrates how the dimensionality of an environment affects navigation complexity [02:30:33].

A key factor in human intelligence may be the “length of childhood” [01:17:08]. A longer maturation period, driven by the speed of brain development, allows for more training data and better abstractions [01:17:16]. This extended period of “exploration instead of exploitation” [01:17:49] means that children temporarily hold a literally “insane,” incomplete model of reality, as their brains are still building foundational layers [01:17:59]. Humans spend far more time than other animals learning basic spatial relationships, object permanence, and social interactions [01:18:16]. This prolonged childhood, potentially an evolutionary adaptation linked to bipedalism and constricted birth canals, allows for significant brain development after birth [01:20:03].
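The exploration-exploitation tradeoff invoked here is a standard construct in reinforcement learning. A minimal epsilon-greedy bandit sketch, purely illustrative of the tradeoff rather than a model of development, shows how a higher exploration rate avoids locking in on an early, mediocre option:

```python
import random

def epsilon_greedy(arm_means, epsilon, steps=10_000):
    """Epsilon-greedy bandit: with probability epsilon pull a random arm
    (explore); otherwise pull the arm with the best estimate (exploit)."""
    estimates = [0.0] * len(arm_means)
    counts = [0] * len(arm_means)
    total = 0.0
    for _ in range(steps):
        if random.random() < epsilon:
            arm = random.randrange(len(arm_means))                       # explore
        else:
            arm = max(range(len(arm_means)), key=estimates.__getitem__)  # exploit
        reward = random.gauss(arm_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]  # running mean
        total += reward
    return total / steps

arms = [0.2, 0.5, 0.9]                     # hidden mean payoffs
print(epsilon_greedy(arms, epsilon=0.3))   # explores widely, finds the best arm
print(epsilon_greedy(arms, epsilon=0.01))  # often locks in on an early mediocre arm
```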

The role of symbols and language in this cognitive gap is also considered [01:22:34]. Humans are capable of “grammatical decomposition” and of producing novel images, whereas elephants merely reproduce learned strokes, suggesting that symbolic representation is at play [01:22:50]. Gorillas, even when raised in human-like environments, don’t exhibit grammatical decomposition in their drawings, indicating a lack of true symbolic depiction [01:23:44]. Symbols, being “tiny” and manipulable, allow for exponential effectiveness in the brain compared to processing images alone [01:24:43]. While the gap between human and ape intelligence isn’t sharp, it is substantial [01:25:01].

Cognitive Architectures

Cognitive architectures are a tradition in psychology influenced by cybernetics and AI, aiming to identify the structure and principles of the human mind [01:29:35]. The field recognizes two main directions: general principles of learning and functional approximation (how to build models from data) and the specific organization within the human mind for feats like language and social interaction [01:30:06].

Most machine learning research focuses on tasks rather than architectures, often using layered structures [01:30:53]. The brain, however, is not organized into simple layers but into regions with “very complex interconnectivity,” akin to a city with diverse transport networks: local streets, long-range subways, and general interconnects like the thalamus [01:31:16]. Because of this rich interconnection, even the visual cortex is not merely a first stage of processing; it also stores information such as textures [01:32:45]. And while deep learning initially focused on feed-forward networks, researchers recognize the need for recurrence and feedback loops [01:33:07].
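That “local streets plus long-range subways” picture is commonly formalized as a small-world network. A minimal sketch using networkx (an illustration of the connectivity pattern, not anything from the episode):

```python
import networkx as nx

# Small-world graph: dense local wiring ("streets") plus a few rewired
# long-range shortcuts ("subways", or long-range fiber tracts).
n, k, p = 1000, 10, 0.05   # nodes, nearest neighbors, rewiring probability
g = nx.connected_watts_strogatz_graph(n, k, p)

print("avg shortest path:", nx.average_shortest_path_length(g))  # short: fast global reach
print("avg clustering:", nx.average_clustering(g))               # high: dense local structure
```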

Early models like Boltzmann machines conceptually allowed for tight, optimal models, but their search space was too large to train tractably [01:33:52]. Restricted Boltzmann Machines (RBMs) limited the connectivity (no links within a layer), which made training feasible at the cost of modeling capacity and led to the development of deep learning architectures [01:34:09]. Current transformer models, like those powering GPT, use “attention” and “self-attention” to bind features across dimensions into a relational graph [01:35:40]. This enables generating coherent text or images in which distant elements are logically connected, implying an internal representation of the whole [01:35:50]. The simplicity and scalability of the transformer architecture make its potential terrifyingly vast [01:38:02].
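The binding mechanism described here can be made concrete with a single head of scaled dot-product self-attention, the core operation of a transformer block. A minimal numpy sketch with random placeholder weights:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.
    x: (seq_len, d_model) token representations."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise token relevance
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ v    # each position becomes a weighted mix of all others

rng = np.random.default_rng(0)
seq_len, d = 8, 16
x = rng.normal(size=(seq_len, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (8, 16): every token now
                                               # carries context from every other
```

The attention weights form exactly the relational graph mentioned above: entry (i, j) says how strongly position i binds to position j.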

Psy and MicroPsi

Dietrich Dörner, a German psychologist and cybernetician, independently developed many AI ideas in parallel with the mainstream field [01:39:02]. Dörner was serious about building minds and incorporated “autonomous motivation” based on cybernetic feedback loops [01:40:41]. He even argued that, by his own definition, his systems possessed “emotions” [01:41:51].

Joscha Bach systematized Dörner’s PSI theory, translating it into the MicroPsi architecture, his own attempt to implement some of these concepts [01:42:40]. Bach’s book, Principles of Synthetic Intelligence, summarizes Dörner’s ideas, contrasts them with other AI theories, describes their implementation, and critiques them to suggest future directions [01:43:00].

MicroPsi, originally written in Java and later Python, has been used in startups for AI planning and controlling industrial robots [01:45:01]. While the architecture is publicly available, its main evolution now occurs within a company for proprietary projects [01:45:34]. Bach is currently using MicroPsi internally for research on spreading activation networks and considering the next edition of MicroPsi [01:45:43].
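Spreading activation, the research focus mentioned above, is straightforward to sketch. This toy version (not from the MicroPsi codebase) propagates decayed activation along weighted links:

```python
import numpy as np

def spread_activation(weights, activation, decay=0.8, steps=5):
    """Toy spreading activation: each step, every node passes decayed
    activation to its neighbors along weighted directed links.
    weights: (n, n) adjacency matrix, weights[i, j] = link from i to j."""
    for _ in range(steps):
        activation = np.maximum(activation, decay * (weights.T @ activation))
    return activation

# Tiny 4-node network: 0 -> 1 -> 2, and 0 -> 3
w = np.array([[0, 1, 0, 1],
              [0, 0, 1, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 0]], dtype=float)
a = np.array([1.0, 0.0, 0.0, 0.0])   # activate node 0 only
print(spread_activation(w, a))       # [1.0, 0.8, 0.64, 0.8]: decays with distance
```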

Looking ahead, future cognitive architectures might move beyond standalone “vision systems” to integrated “across-systems” [01:47:19]. This involves different architectural parts implementing general principles for interacting with one another, leveraging existing tools such as shader programs and graphics engines instead of reinventing low-level components [01:47:30]. For large-scale AI, integrating with existing distributed computing platforms like Apache Ignite, Flink, or Spark could enable massive scale and leverage prior engineering effort, freeing researchers to focus on higher-level value [01:46:44]. However, these platforms require systems that can self-organize to account for the realities of network traversal costs; otherwise, even powerful clusters become inefficient [01:47:52].
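As a hedged illustration of that integration idea (PySpark here; Ignite and Flink expose analogous primitives), note how the one shuffle step is exactly where the network traversal cost appears:

```python
# Illustrative only: offloading a bulk update to an existing engine.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("toy-sweep").getOrCreate()
nodes = spark.sparkContext.parallelize(range(1_000_000))

# The per-element map runs in parallel with no coordination...
counts = nodes.map(lambda n: (n % 1000, 1)).reduceByKey(lambda a, b: a + b)

# ...but reduceByKey shuffles data between machines: the network
# traversal cost that a self-organizing system has to plan around.
print(counts.take(5))
spark.stop()
```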