From: jimruttshow8596

This article explores the distinction between current Generative Artificial Intelligence (AI), particularly large language models (LLMs), and Artificial General Intelligence (AGI), based on insights from leading AGI authority Ben Goertzel [00:35:55]. The discussion highlights the rapid advancements in the AI space, which Goertzel compares to the personal computer revolution but at ten times the speed [01:23:00]. This exponential acceleration, as projected by Ray Kurzweil, is occurring differentially across various domains [01:53:00].

Ben Goertzel’s Core Thesis on LLMs and AGI

Ben Goertzel’s fundamental thesis, as outlined in his paper “Generative AI versus AGI: the cognitive strengths and weaknesses of modern LLMs,” states that current LLMs (Transformer Nets trained to predict the next token) will likely not lead to full human-level AGI on their own [04:34:00]. However, he is bullish that these systems can perform many amazing and useful functions, potentially even passing the Turing test [04:58:00]. More importantly, LLMs can serve as valuable components within systems designed to achieve AGI [05:08:00].

A key distinction is whether LLMs serve as the “hub” or a “supporting role” in a hybrid AGI system [05:55:00]. For instance, OpenAI’s AGI architecture might use multiple LLMs as an integration hub, calling upon other non-LLM systems like DALL-E or Wolfram Alpha [05:22:00]. In contrast, Goertzel’s own approach with OpenCog Hyperon positions a weighted labeled metagraph (AtomSpace) as the central hub, with LLMs interacting on the periphery [05:57:00]. This highlights a finer-grained distinction often overlooked in the polarized AGI field [06:35:00].
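
To make the hub-versus-periphery distinction concrete, here is a minimal Python sketch of the two orchestration patterns. Every component in it (`call_llm`, `call_symbolic_solver`, the reasoning stand-ins) is a hypothetical stub for illustration, not an actual OpenAI or OpenCog API.

```python
# Illustrative sketch of two orchestration patterns; all components are stubs.

def call_llm(prompt: str) -> str:
    return f"<llm output for: {prompt}>"        # stand-in for an LLM call

def call_symbolic_solver(task: str) -> str:
    return f"<solver result for: {task}>"       # stand-in for e.g. a Wolfram-Alpha-like tool

def llm_as_hub(task: str) -> str:
    """LLM-centric pattern: the LLM itself decides which specialist tool to invoke."""
    plan = call_llm(f"Which tool should handle this task? {task}")
    if "math" in plan:
        return call_symbolic_solver(task)
    return call_llm(task)

def metagraph_as_hub(task: str) -> str:
    """Hyperon-style pattern: a central knowledge store drives the reasoning and
    calls an LLM only as a peripheral service, e.g. for natural-language output."""
    atoms = {"goal": task}                                   # stand-in for parsing into an AtomSpace
    result = f"<reasoning result over {atoms['goal']}>"      # stand-in for logic/evolutionary learning
    return call_llm(f"Phrase this result in plain English: {result}")

print(llm_as_hub("math: integrate x^2"))
print(metagraph_as_hub("classify this observation"))
```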

Goertzel puts the odds of achieving AGI within five years at roughly 60/40 [03:19:00]. He emphasizes that “LLM plus plus” (an LLM augmented with external tools) will likely not lead to human-level AGI, but “something plus LLM” might achieve AGI faster than ignoring LLMs entirely [08:01:00].

Current Limitations of LLMs

Despite their capabilities, LLMs exhibit several fundamental limitations that differentiate them from human-level AGI:

Hallucination Problem

LLMs are known to “hallucinate,” or make up facts, especially for obscure queries [09:39:42]. While improvements are being made, such as using probes to detect hallucination signatures within the network, this does not amount to the kind of true understanding or “reality discrimination” that humans possess [11:12:00]. Human reality discrimination involves reflective self-modeling and understanding, a capability LLMs lack [12:12:00]. One brute-force method to reduce hallucinations is to run the same query multiple times, since correct answers tend to show a different entropy profile across runs than hallucinated ones [13:52:00].
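
That multi-query idea can be sketched in a few lines of Python. This is a generic consistency check rather than the probe technique mentioned above; `sample_answer` stands in for any function that returns one stochastic completion per call, and the entropy threshold in the demo is illustrative, not a published value.

```python
from collections import Counter
import math
import random

def answer_entropy(prompt, sample_answer, n_samples=10):
    """Query the model repeatedly and compute the Shannon entropy of its answers.
    Widely varying (high-entropy) answers are a rough hallucination signal."""
    answers = [sample_answer(prompt) for _ in range(n_samples)]
    counts = Counter(answers)
    probs = [c / n_samples for c in counts.values()]
    entropy = -sum(p * math.log2(p) for p in probs)
    best_answer, _ = counts.most_common(1)[0]
    return best_answer, entropy

# Demo with a stand-in sampler; a real use would call an LLM at nonzero temperature.
def fake_sampler(prompt):
    return random.choice(["Paris", "Paris", "Paris", "Lyon"])

answer, h = answer_entropy("What is the capital of France?", fake_sampler)
print(answer, round(h, 2))   # low entropy across runs suggests a consistent answer
```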

Banality and Lack of True Creativity

The natural output of LLMs often exhibits “banality” [34:14:00]. While clever prompting can push their output beyond the average, it generally does not reach the level of great human creativity [34:31:00]. LLMs can produce first drafts of movie scripts or decent blues guitar solos [35:00:00], but they are not close to generating truly original artistic works or inventing new musical styles the way Thelonious Monk or Jimi Hendrix did [29:13:00].

Inability for Complex Multi-Step Reasoning and Original Science

LLMs struggle with the complex multi-step reasoning required to write original scientific papers or invent new theories such as quantum gravity [30:04:00]. While they can “turn the crank” on advanced mathematical concepts if given an initial idea, they lack the deep judgment and mathematical aesthetic sense needed to discern truly interesting definitions or theorems [38:48:00]. Fundamentally, LLMs recognize primarily surface-level patterns in data and do not seem to learn abstractions the way humans do [32:33:00]. Their character is “fundamentally derivative and imitative,” which limits their capacity for fundamental surprise or radical leaps beyond known information [33:18:00].

Defining AGI and Its Benchmarks

There is no universally agreed-upon definition of AGI [21:37:00]. One mathematical approach, formalized by Marcus Hutter and DeepMind co-founder Shane Legg, defines general intelligence as the ability to achieve a huge variety of computable goals across diverse computable environments [21:49:00]. However, by this measure humans themselves rate poorly, being, in Goertzel’s words, “complete retards” at optimizing arbitrary reward functions [23:22:00].
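
Legg and Hutter’s measure can be stated compactly: an agent’s universal intelligence is its expected reward summed over all computable environments, with simpler environments (lower Kolmogorov complexity) weighted more heavily.

```latex
\Upsilon(\pi) \;=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
```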

Other philosophical views, like Weaver’s (David Weinbaum’s) theory of open-ended intelligence, suggest intelligence is about complex self-organizing systems maintaining their existence, individuating, and self-transforming [24:15:00].

For human-level AGI, the focus shifts to capabilities people are good at. While IQ tests are imperfect, more multifactorial views like Gardner’s theory of multiple intelligences (musical, linguistic, logical-mathematical, etc.) offer a closer approximation [25:49:00].

The Turing Test, which assesses an AI’s ability to imitate human conversation, was never a robust measure of general intelligence, as fooling people can be “disturbingly easy” [26:31:00]. Ben Goertzel proposed the “MIT student test” (a robot passing MIT classes with good grades) and the “Berklee School of Music test” (becoming a jazz guitar player and getting laid at the bar) as benchmarks [27:48:00]. However, even these have limitations, as they don’t necessarily require frontier-pushing creativity or original composition [29:06:00].

Humans’ ability to abstract is guided by their “agentic nature,” which involves survival, reproduction, and self-transformation within an environment [43:18:00]. This leads to the development of heuristics – compact summaries of tactics that generate new solutions [44:08:00]. This interplay between abstraction and heuristic development is crucial for human intelligence.

Future Directions for AGI Development

To overcome the limitations of LLMs and achieve AGI, several architectural and training approaches are being explored:

Neural Network Innovations

One promising path involves reintroducing into Transformer networks the kind of recurrence found in LSTMs, which was largely stripped out for the sake of scalability [46:43:00]. Recurrence is key for generating interesting abstractions [47:04:00]. Alternatives to backpropagation for training, such as predictive-coding-based methods, could also be explored, especially for richly recurrent networks [47:36:00].
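
As a toy illustration (not something proposed in the episode), one way to add recurrence back is to interleave a standard self-attention layer with an LSTM pass over the sequence, sketched here in PyTorch:

```python
import torch
import torch.nn as nn

class RecurrentTransformerBlock(nn.Module):
    """Toy hybrid block: self-attention followed by an LSTM pass over the sequence,
    reintroducing the kind of recurrence Transformers dropped for scalability."""
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.attn_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.recurrent = nn.LSTM(d_model, d_model, batch_first=True)

    def forward(self, x):                # x: (batch, seq_len, d_model)
        x = self.attn_layer(x)           # parallel self-attention pass
        x, _ = self.recurrent(x)         # sequential, stateful pass
        return x

# Usage sketch:
block = RecurrentTransformerBlock()
out = block(torch.randn(2, 16, 256))     # -> shape (2, 16, 256)
```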

Hybrid Architectures

Integrating different deep neural network architectures is another avenue. An example is a “Gemini-type architecture” that combines something like AlphaZero (for planning and strategic thinking) with a neural knowledge graph (as in the differentiable neural computer, DNC) and a Transformer, potentially with added recurrence [48:14:00]. Google DeepMind is ideally suited for this given its expertise in the DNC, AlphaZero, and evolutionary learning [48:42:00]. Yoshua Bengio’s group also explores combining Transformers with neural nets that perform minimum description length learning, explicitly seeking abstractions [49:31:00].

OpenCog Hyperon’s Approach

Goertzel’s OpenCog Hyperon project is a framework for AGI that aims to achieve human-level intelligence and beyond [54:11:00]. Its core component is a “weighted labeled metagraph” (AtomSpace) [54:38:00]. This metagraph is:

  • Hypergraph: Links can span multiple nodes [54:41:00].
  • Metagraph: Links can point to other links or subgraphs [54:46:00].
  • Typed and Weighted: Each link has a type (represented as a sub-metagraph) and numerical weights [54:51:00].

This framework represents various types of knowledge (declarative, procedural, attentional, sensory) and cognitive operations (reinforcement learning, logical reasoning, pattern recognition) as “little learning programs” within the hypergraph itself [55:02:00]. A new programming language called MeTTa (Meta-Type Talk) is used, where programs are sub-metagraphs that act on, transform, and rewrite chunks of the same metagraph in which they exist [55:33:00].
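
A minimal Python sketch of a weighted labeled metagraph along these lines is shown below; it is purely illustrative and is not the actual AtomSpace or MeTTa API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

@dataclass(frozen=True)
class Node:
    name: str

@dataclass(frozen=True)
class Link:
    """A metagraph link: its targets may be nodes or other links, its type is
    itself an atom, and it carries numeric weights (e.g. a truth value)."""
    link_type: "Atom"
    targets: Tuple["Atom", ...]
    weights: Tuple[float, ...] = (1.0,)

Atom = Union[Node, Link]

@dataclass
class AtomSpaceSketch:
    """A toy 'AtomSpace': just a growing collection of nodes and links."""
    atoms: List[Atom] = field(default_factory=list)

    def add(self, atom: Atom) -> Atom:
        self.atoms.append(atom)
        return atom

# A link whose target is another link, i.e. knowledge about knowledge.
space = AtomSpaceSketch()
cat = space.add(Node("cat"))
animal = space.add(Node("animal"))
fact = space.add(Link(Node("Inheritance"), (cat, animal), weights=(0.95,)))
belief = space.add(Link(Node("BelievedBy"), (fact, Node("Alice"))))
print(len(space.atoms))   # 4 atoms added explicitly
```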

The core philosophy is that a mind is a system for recognizing patterns in the world and itself, including patterns in its own processes and execution traces [56:33:00]. Unlike LLMs, which are not ideally suited for recognizing patterns within themselves, OpenCog Hyperon is reflection-oriented due to its self-referential metagraph structure [57:11:00].

Hyperon lends itself well to logical reasoning, evolutionary programming, and self-modification.

In this architecture, LLMs can exist at the periphery, supporting the central self-rewriting metagraph [58:37:00]. This approach is distinct from LLM-centric systems (like OpenAI’s) or constellations of deep neural nets (like DeepMind’s) [58:44:00]. Goertzel believes it offers a shorter path from human-level AGI to superhuman ASI because the system is designed to rewrite its own code [59:17:00]. It is also well-suited for science due to its capacity for logical reasoning and precise description of repeatable procedures [59:48:00].

The main challenge for OpenCog Hyperon is the scalability of its infrastructure [01:00:40]. The project involves building a new version from the ground up to address usability and scalability issues [01:01:31]. This includes a compiler from MeTTa to a language called Rholang (developed by Greg Meredith) for efficient use of CPU cores, which can in turn be translated into hypervector math for specialized hardware such as associative processors [01:02:05]. This pipeline aims to provide the scalable plumbing needed to validate whether OpenCog Hyperon can leverage logical reasoning and evolutionary programming at scale [01:02:51].

The “AGI race” is genuinely underway, with major companies investing significant resources [01:05:56]. However, Goertzel argues that even with substantial funding, the “weird little lore” of tuning complex AI systems makes it difficult to replicate specialized architectural successes quickly [01:09:12]. The Hyperon project is progressing well on technical milestones, benefiting from increased funding and improved tooling [01:05:54]. The distributed AtomSpace, built by Andre Senna and based on MongoDB and Redis, is designed to scale effectively for this project [01:06:03].