From: jimruttshow8596

The field of AI is changing rapidly, with new developments emerging at a pace often compared to the personal computer revolution of the late 1970s and early 1980s, but “10 times faster” [00:01:23]. This acceleration is particularly evident in the AGI space [00:01:48]. The upheavals are expected to continue at larger magnitude and even greater speed, heading toward a potential singularity, as projected by Ray Kurzweil [00:01:53].

Current State of Large Language Models (LLMs) and AGI

Ben Goertzel, a leading authority on Artificial General Intelligence (AGI) and credited with coining the term [00:00:36], believes that current forms of Large Language Models (LLMs), primarily Transformer networks trained to predict the next token, will not lead to a “full-on human level AGI” [00:04:49]. However, he asserts that LLMs can perform “many amazing useful functions” [00:04:58] and serve as “valuable components of systems that can achieve AGI” [00:05:08].

LLM Limitations Driving AGI Infrastructure Research

LLMs exhibit several limitations that highlight the need for new architectural approaches to achieve AGI. These include:

  • Hallucination Problem: LLMs tend to “make up” information when asked relatively obscure questions [00:09:42]. Techniques like probing internal network states might filter these confabulations out for practical applications [00:11:13] (a minimal probe sketch follows this list), but this doesn’t address the underlying issue: LLMs lack a human-like “reality discrimination function” [00:12:12].
  • Banality and Derivative Output: The natural state of LLM output is often described as “banality,” an average of every utterance [00:34:14]. While clever prompting can move them beyond their “centers,” they still struggle to match the “great creative human” [00:34:31].
  • Complex Multi-step Reasoning: LLMs lack the ability to perform complex, multi-step reasoning required for original scientific papers or advanced mathematical derivations [00:30:04].
  • Original Artistic Creativity: They struggle with the fundamental aesthetic creativity needed to invent new musical styles or write truly original songs [00:30:14]. LLMs recognize “surface level patterns” in data but don’t show strong evidence of learning human-like abstractions [00:32:33].
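
As mentioned in the hallucination item above, one practical mitigation is probing the network’s internal states for a “reality discrimination” signal. Below is a minimal, illustrative sketch of that idea: a linear probe over hidden activations of a small open model. The model choice (GPT-2), layer, pooling, and toy data are assumptions made for this sketch, not anything described in the episode.

```python
# Illustrative "reality discrimination" probe: a linear classifier over an
# LLM's hidden states, trained to flag statements the model is likely making
# up. Model choice (GPT-2), pooling, layer, and toy data are assumptions made
# for this sketch only.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

def hidden_state(text: str, layer: int = -1) -> torch.Tensor:
    """Mean-pooled hidden state of one layer for a single statement."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    return out.hidden_states[layer].mean(dim=1).squeeze(0)

# Toy labeled data: 1 = factual, 0 = fabricated.
statements = [("Paris is the capital of France.", 1),
              ("The Eiffel Tower is in Sydney.", 0)]
X = torch.stack([hidden_state(s) for s, _ in statements]).numpy()
y = [label for _, label in statements]

probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.predict_proba(X))  # estimated probability each statement is factual
```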

These limitations underscore that current LLMs are fundamentally “derivative and imitative” [00:33:18], needing human guidance for original seeding and curation [00:39:09].

Different Approaches to AGI Infrastructure Development

The push towards AGI involves various architectural strategies:

OpenAI’s Hybrid Approach

OpenAI is pursuing an AGI architecture in which a number of different LLMs act as a “mixture of experts” and serve as the “integration hub” [00:05:26]. This hub then calls upon other specialized systems such as DALL-E or Wolfram Alpha [00:05:38].
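
A minimal sketch of this “LLM as integration hub” pattern is shown below: the hub asks an LLM to pick a specialist backend (image generation, symbolic math, plain chat) and then dispatches the request. The function names and routing prompt are illustrative placeholders, not OpenAI’s actual architecture.

```python
# Minimal sketch of the "LLM as integration hub" pattern: the LLM itself picks
# which specialist (image generation, symbolic math, plain chat) should handle
# a request, then the hub dispatches to it. All function names and the routing
# prompt are illustrative placeholders, not OpenAI's internal design.
from typing import Callable, Dict

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to a chat-completion API."""
    raise NotImplementedError

def generate_image(request: str) -> str:
    """Placeholder for a DALL-E-style image backend."""
    raise NotImplementedError

def symbolic_math(request: str) -> str:
    """Placeholder for a Wolfram Alpha-style computation backend."""
    raise NotImplementedError

SPECIALISTS: Dict[str, Callable[[str], str]] = {
    "image": generate_image,
    "math": symbolic_math,
    "chat": ask_llm,
}

def hub(request: str) -> str:
    """The LLM chooses the expert; the chosen expert does the work."""
    choice = ask_llm(
        "Answer with exactly one word - image, math, or chat - naming the best "
        f"tool for this request:\n{request}"
    ).strip().lower()
    return SPECIALISTS.get(choice, ask_llm)(request)  # fall back to plain chat
```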

Google/DeepMind’s Neural Net Universe

Google and DeepMind are well-suited to explore a “Gemini type architecture” [00:48:14]. This approach could involve combining:

  • AlphaZero: For planning and strategic thinking [00:48:18].
  • Neural Knowledge Graphs: Such as those found in the Differentiable Neural Computer (DNC) [00:48:21].
  • Transformers with Recurrence: Reintroducing more recurrence into the network architecture and replacing attention heads with more sophisticated elements, an “obvious way to get interesting abstractions” [00:47:04] (see the sketch after this list).
  • Alternative Training Methods: Exploring methods like predictive coding-based training instead of backpropagation [00:47:36], or leveraging evolutionary learning algorithms [00:48:55] for more complex architectures.
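
As a rough sketch of the “reintroduce recurrence” idea, the toy PyTorch block below keeps the residual-plus-feed-forward shape of a Transformer layer but swaps the attention sub-layer for a GRU. The dimensions and the choice of a GRU are illustrative assumptions, not a DeepMind design.

```python
# Toy sketch of "reintroducing recurrence": a Transformer-style block whose
# attention sub-layer is replaced by a GRU, keeping the usual residual and
# feed-forward structure. Dimensions and layer choices are illustrative.
import torch
import torch.nn as nn

class RecurrentBlock(nn.Module):
    def __init__(self, d_model: int = 256, d_ff: int = 1024):
        super().__init__()
        self.recurrence = nn.GRU(d_model, d_model, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                                nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sequence, d_model)
        recurrent_out, _ = self.recurrence(self.norm1(x))
        x = x + recurrent_out              # residual, as in a standard block
        return x + self.ff(self.norm2(x))  # position-wise feed-forward

tokens = torch.randn(2, 16, 256)        # (batch, sequence, embedding)
print(RecurrentBlock()(tokens).shape)   # torch.Size([2, 16, 256])
```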

OpenCog Hyperon’s Metagraph-Centric Design

Goertzel’s own project, OpenCog Hyperon, offers a contrasting approach, where a “weighted labeled metagraph” serves as the central “hub for everything” [00:05:57]. LLMs, DALL-E, and other neural networks act as peripheral components, feeding into and interacting with this core [00:06:08].
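
A rough sketch of what a weighted labeled metagraph hub might look like as a data structure is given below: atoms are nodes or links, links can point at other links, and each atom carries weights. The field names loosely echo OpenCog conventions (truth value, attention value), but the classes themselves are illustrative assumptions, not Hyperon’s actual Atomspace API.

```python
# Minimal sketch of a weighted labeled metagraph "hub": atoms are nodes or
# links, and links may target other links, making the structure a metagraph
# rather than an ordinary graph. Field names echo common OpenCog usage, but
# the classes are illustrative, not Hyperon's real Atomspace.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Atom:
    label: str                    # e.g. "Concept", "Inheritance", "SourcedFrom"
    name: str = ""                # node name; empty for links
    targets: List["Atom"] = field(default_factory=list)  # link targets (may be links)
    weight: float = 1.0           # truth/strength value
    attention: float = 0.0        # short-term importance

class MetaGraph:
    def __init__(self):
        self.atoms: List[Atom] = []

    def add(self, atom: Atom) -> Atom:
        self.atoms.append(atom)
        return atom

g = MetaGraph()
cat = g.add(Atom("Concept", "cat"))
animal = g.add(Atom("Concept", "animal"))
inheritance = g.add(Atom("Inheritance", targets=[cat, animal], weight=0.95))
# A link that points at another link, e.g. recording that an LLM produced it:
source = g.add(Atom("Concept", "LLM-output"))
g.add(Atom("SourcedFrom", targets=[inheritance, source]))
```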

Key Features of the Hyperon Architecture:

  • Self-Modifying Metagraph: The core is a “big potentially distributed self-modifying self-rewriting metagraph” [00:55:57].
  • Knowledge Representation: It aims to represent multiple types of knowledge (episodic, declarative, procedural, attentional, and sensory) within this metagraph and linked representations [00:55:02].
  • Cognitive Operations: Different cognitive operations, such as reinforcement learning, procedural learning, logical reasoning, and sensory pattern recognition, are represented as “little learning programs” within the metagraph itself [00:55:15].
  • MeTTa Language: A new programming language, MeTTa, allows programs to be represented as sub-metagraphs that act on, transform, and rewrite chunks of the same metagraph in which they exist [00:55:33] (a toy illustration follows this list).
  • Reflection-Oriented: Unlike LLMs that predict tokens, OpenCog is designed for “recognizing patterns in its own mind, in its own process and its own execution traces” and representing those patterns internally [00:56:54].
  • Compatibility with AI Paradigms: It integrates various historical AI paradigms like logical inference and evolutionary programming, as well as new ideas such as “self-organizing mutually rewriting sets of rewrite rules” [00:57:55].
  • Path to Superhuman AGI: This architecture is considered less human-like initially but offers a “really short” path from human-level AGI to superhuman AGI, as the system is based on “rewriting its own code” [00:59:35].
  • Science and Creativity: It is well-suited for science due to its focus on logical reasoning and precise procedures [00:59:48], and for creativity through evolutionary programming [01:00:07].
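
As a toy illustration of the “programs as sub-metagraphs” idea referenced in the list above, the sketch below stores rewrite rules as atoms in the same space they act on, so a rule can rewrite data atoms or, in principle, other rules. The tuple encoding and exact-match rule format are invented for this sketch; they are not MeTTa syntax or semantics.

```python
# Toy illustration of programs-as-sub-metagraphs: rewrite rules are stored in
# the same atom space they act on, so a rule can rewrite data atoms or, in
# principle, other rules. The tuple encoding and exact-match rule format are
# invented here; they are not MeTTa syntax or semantics.
Space = set

space: Space = {
    ("Inheritance", "cat", "animal"),
    ("Inheritance", "animal", "living-thing"),
    # A rule is just another atom: ("rule", head, body) = "replace head with body".
    ("rule", ("Inheritance", "cat", "animal"), ("Inheritance", "cat", "mammal")),
}

def step(space: Space) -> Space:
    """One rewrite pass: for each stored rule, replace its head atom if present."""
    out = set(space)
    for atom in space:
        if atom[0] == "rule":
            _, head, body = atom
            if head in out:
                out.remove(head)
                out.add(body)
    return out

space = step(space)
print(("Inheritance", "cat", "mammal") in space)  # True: the graph rewrote itself
```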

Scaling OpenCog Hyperon:

The main challenge for OpenCog Hyperon is the “scalability of infrastructure” [01:00:43]. The project decided to rewrite the old OpenCog version from the ground up to address both usability and scalability issues [01:01:31].

Key components of the scalability pipeline include:

  • Compiler Pipeline: A compiler from MeTTa (Hyperon’s native language) to Rholang, a language developed by Greg Meredith for “extremely efficient use of multiple CPU cores and hyper-threads” [01:02:05].
  • HyperVector Math: Translating Rholang into hypervector math, which deals with “very high dimensional sparse bit vectors” [01:02:26] (see the sketch after this list).
  • Specialized Hardware: Running the hypervector math on associative processing units (APUs), e.g., from GSI, rather than only on GPUs [01:02:31].
  • Distributed Atom Space: The distributed atom space backend uses MongoDB to store atoms and Redis to store indexes, designed to scale as well as these databases [01:06:13].
  • Blockchain Integration: Integration with a secure, blockchain-based atom space module called RChain DB and the SingularityNET HyperCycle infrastructure allows for decentralized as well as distributed operations, without significant slowdown [01:06:53].
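
For intuition about what “hypervector math” involves, here is a toy sketch in the general hyperdimensional-computing style: very high dimensional bit vectors combined with binding, bundling, and similarity operations. It uses dense rather than sparse bit vectors for simplicity, and it is only illustrative; Hyperon’s actual MeTTa-to-hypervector mapping is not shown here.

```python
# Toy hypervector math in the hyperdimensional-computing style: very high
# dimensional bit vectors with binding (XOR), bundling (majority vote), and
# Hamming similarity. Dense rather than sparse vectors are used here for
# simplicity; this is not Hyperon's actual MeTTa-to-hypervector mapping.
import numpy as np

DIM = 10_000
rng = np.random.default_rng(0)

def random_hv() -> np.ndarray:
    return rng.integers(0, 2, DIM, dtype=np.uint8)

def bind(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    return np.bitwise_xor(a, b)          # reversible: bind(bind(a, b), b) == a

def bundle(*vs: np.ndarray) -> np.ndarray:
    return (np.sum(vs, axis=0) > len(vs) / 2).astype(np.uint8)  # majority (ties -> 0)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - float(np.mean(a != b))  # 1.0 = identical, ~0.5 = unrelated

role, cat, animal = random_hv(), random_hv(), random_hv()
record = bundle(bind(role, cat), animal)      # a tiny structured record
print(similarity(bind(record, role), cat))    # ~0.75, well above the ~0.5 chance level
```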

Goertzel likens this to the breakthrough of deep neural networks, which only realized their potential when “long existing algorithms hit hard infrastructure that would let them finally do their thing” [01:03:09]. The hope is that this scalable processing infrastructure for Hyperon will enable “ancient historical AI paradigms” like logical reasoning and evolutionary programming to operate at great scale [01:03:32].

“If my cognitive theory is right… this metagraph system representing different kinds of memory and learning and reasoning in this self-modifying metagraph… is conceptually a great route to AGI, then basically our obstacle to validating or refuting my hypothesis here is having a scalable enough system” [01:00:50].

Other Notable Players

  • Anthropic (Claude): Founded by ex-OpenAI and Google Brain researchers [01:03:00], Anthropic offers Claude, which is noted for being “much better than GPT-4 in many things” [01:16:41], particularly in science, mathematics, and medicine, and at writing dialogue [01:16:50].
  • Character.ai: Described as number two in revenue after ChatGPT [01:36:00], it was founded by two Google Brain researchers [01:45:00].

The AGI race is “now genuinely” underway, with major companies investing significant resources [02:05:00]. This includes dedicated AGI teams within larger organizations, even if they are often piggybacking on teams doing more immediate, applied work [02:26:00]. The overall acceleration in AI also positively impacts non-LLM AGI projects, even if less transparently [01:05:24].