From: jimruttshow8596

Jim Rutt hosts David Krakauer, President of the Santa Fe Institute (SFI) and William H. Miller Professor of Complex Systems, to discuss the evolving landscape of science, particularly the distinction between theory-driven and data-driven approaches, and the implications for complexity science [00:00:31].

Theory vs. Data-Driven Science

Krakauer suggests that science is approaching a “bifurcation,” not necessarily between theory and data, but between “fine-grained paradigms of prediction” and “coarse-grained paradigms of understanding” [00:02:31]. Historically, the physical sciences were fortunate that fundamental theories often proved very useful in practice [00:03:05]. A new era may be emerging, however, in which a single topic demands two distinct approaches: one aimed at understanding and another at practical application [00:03:31].

Triumphs of Data Science Over Theory

Several examples highlight this shift:

  • AlphaFold: This system revolutionized protein folding prediction, a long-standing “Holy Grail” problem in computational chemistry [00:04:04]. AlphaFold achieved vastly superior results using “brute force computation” but provided “zero theoretical insight” into the underlying mechanisms [00:04:51].
  • Large Language Models (LLMs): Traditional computational linguistics struggled to capture the “only marginally lawful nature of human language” with rule-based parsers [00:05:10]. In contrast, “simple-minded technology,” “brute force data,” and “brute force computation” have produced “unbelievably powerful language models” that, at least initially, offer “little insight” into the underlying mechanisms [00:05:38]. These models, such as GPT-4, have trillions of parameters, putting them “way beyond human understanding” [00:13:00].

Historical Context: Induction and Neural Networks

The concept of induction, as articulated by Hume in the 18th century, suggested that humans could not understand the world through deduction alone [00:06:41]. The development of statistics in the 1920s by figures such as Fisher, focused on reducing measurement error and estimating parameters, gave induction a mathematical framework [00:07:02].

Intriguingly, early neural networks in the 1940s, originating in the collaboration between neurophysiologist Warren McCulloch and logician Walter Pitts, were primarily “deductive frameworks” about logic, not induction [00:06:31]. It was much later, in the 1970s and 1980s, that neural networks began to fuse with statistical, inductive approaches, eventually reaching their current form with “big data and GPUs” from the 1990s onward [00:10:37].

Superhuman Models and High Dimensionality

David Krakauer coined the term “superhuman models” to describe models that perform well after crossing a “statistical uncanny valley” [00:14:49]. This valley refers to a range where adding more parameters leads to worse out-of-sample performance [00:14:40]. However, if parameters continue to be added, these models eventually perform well again, effectively solving the problem of induction that Hume identified [00:14:40].
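The “valley then recovery” pattern can be illustrated with a toy numerical sketch. This is an illustration constructed for this summary, not something from the episode: the random-ReLU-features model, the sample sizes, and the noise level are all arbitrary assumptions, chosen because minimum-norm fits on random features are a standard way to exhibit the effect (often called “double descent” in the machine-learning literature).

```python
# Toy sketch of the "statistical uncanny valley": minimum-norm least squares on random
# ReLU features. Test error typically worsens as the number of features approaches the
# number of training samples, then improves again as the feature count keeps growing.
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, d = 100, 1000, 10

w_true = rng.normal(size=d)                        # hidden "true" linear rule
X_train = rng.normal(size=(n_train, d))
X_test = rng.normal(size=(n_test, d))
y_train = X_train @ w_true + 0.1 * rng.normal(size=n_train)
y_test = X_test @ w_true

def relu_features(X, W):
    """Random ReLU features: one column per random hidden unit."""
    return np.maximum(X @ W, 0.0)

for n_features in (10, 50, 90, 100, 110, 200, 500, 2000):
    W = rng.normal(size=(d, n_features)) / np.sqrt(d)
    Phi_train = relu_features(X_train, W)
    Phi_test = relu_features(X_test, W)
    coef = np.linalg.pinv(Phi_train) @ y_train     # minimum-norm least-squares fit
    test_mse = np.mean((Phi_test @ coef - y_test) ** 2)
    print(f"{n_features:5d} parameters: test MSE = {test_mse:.3f}")
```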

This phenomenon is tied to the nature of complex systems:

  • High-Dimensional Regularities: Complex phenomena exhibit regularities, but these are often “fundamentally high-dimensional” [00:14:45].
  • Ultra-High Dimensionality Advantage: The “miracle of ultra high dimensionality” allows gradient descent (a core optimization technique for deep learning) to work effectively, even on functions previously thought problematic [00:16:03]. Problems like “local minima,” long a critique of adaptive computation and evolution, are ameliorated in high dimensions, not made worse [00:17:30] (a bare-bones sketch of the gradient-descent loop follows this list).
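The sketch below shows only the mechanics of the update: compute a gradient, step downhill, repeat. The least-squares objective, the 1,000-dimensional random problem, and the fixed step size are assumptions made for illustration; a convex toy like this does not by itself demonstrate the claim about local minima.

```python
# Plain gradient descent on a high-dimensional least-squares objective.
import numpy as np

rng = np.random.default_rng(0)
dim = 1000
A = rng.normal(size=(dim, dim)) / np.sqrt(dim)     # random linear map
target = rng.normal(size=dim)

def loss_and_grad(x):
    residual = A @ x - target
    return 0.5 * residual @ residual, A.T @ residual

x = np.zeros(dim)
learning_rate = 0.1
for step in range(501):
    loss, grad = loss_and_grad(x)
    x -= learning_rate * grad                      # the gradient-descent update
    if step % 100 == 0:
        print(f"step {step:4d}  loss {loss:.2f}")
```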

The Future of Scientific Discovery

This new paradigm suggests a novel approach to science. Krakauer proposes using large neural networks as “pre-processors for parsimonious science” [00:27:39]. This involves:

  1. Training a Neural Network: The neural network acts as a “surrogate for reality” [00:30:50].
  2. Sparsification: Connections within the network are mutilated or removed to find the “absolute minimum that they can sustain as predictive engines” [00:42:57].
  3. Symbolic Regression: A genetic algorithm is run on the sparsified network to “produce formulas” or algebraic equations that encode the regularities [00:29:36]. This approach has led to discoveries, such as new parsimonious encodings for dark energy behavior [00:29:55].

This implies two classes of models: the large neural network for prediction and a smaller, sparsified model for understanding [00:30:59].
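A schematic version of this surrogate → sparsify → formula pipeline is sketched below. It is a toy reconstruction under loudly stated assumptions, not the actual tooling discussed in the episode: the “surrogate” is a tiny hand-rolled network, “sparsification” is simple magnitude pruning, and the genetic-algorithm symbolic regression is replaced by sparse regression over a hand-picked library of candidate terms.

```python
# Toy stand-in for the three-step pipeline: (1) train a neural surrogate of "reality",
# (2) prune small-magnitude connections, (3) read a short formula back out of the
# pruned surrogate via sparse regression over candidate terms.
import numpy as np

rng = np.random.default_rng(0)

# Hidden "reality" to be rediscovered: y = 3*x0^2 - 2*x1, observed with a little noise.
X = rng.uniform(-1, 1, size=(400, 2))
y = 3 * X[:, 0] ** 2 - 2 * X[:, 1] + 0.01 * rng.normal(size=400)

# --- Step 1: train a small neural-network surrogate (full-batch gradient descent).
W1 = rng.normal(scale=0.5, size=(2, 32)); b1 = np.zeros(32)
W2 = rng.normal(scale=0.5, size=(32, 1)); b2 = np.zeros(1)
for _ in range(4000):
    h = np.tanh(X @ W1 + b1)
    pred = (h @ W2 + b2).ravel()
    g = 2 * (pred - y)[:, None] / len(X)           # gradient of MSE w.r.t. predictions
    gh = (g @ W2.T) * (1 - h ** 2)                 # backpropagate through the hidden layer
    W2 -= 0.1 * (h.T @ g);  b2 -= 0.1 * g.sum(axis=0)
    W1 -= 0.1 * (X.T @ gh); b1 -= 0.1 * gh.sum(axis=0)

# --- Step 2: sparsify the surrogate by pruning small-magnitude connections.
W1[np.abs(W1) < 0.05] = 0.0
W2[np.abs(W2) < 0.05] = 0.0

# --- Step 3: query the pruned surrogate and fit a sparse formula over candidate terms
# (a stand-in for the genetic-algorithm symbolic regression described in the episode).
grid = rng.uniform(-1, 1, size=(200, 2))
surrogate_y = (np.tanh(grid @ W1 + b1) @ W2 + b2).ravel()
terms = {"1": np.ones(200), "x0": grid[:, 0], "x1": grid[:, 1],
         "x0^2": grid[:, 0] ** 2, "x1^2": grid[:, 1] ** 2, "x0*x1": grid[:, 0] * grid[:, 1]}
basis = np.column_stack(list(terms.values()))
coef, *_ = np.linalg.lstsq(basis, surrogate_y, rcond=None)
coef[np.abs(coef) < 0.1] = 0.0                     # keep only dominant terms
print("recovered formula: y ~ " +
      " + ".join(f"{c:.2f}*{name}" for name, c in zip(terms, coef) if c != 0.0))
```

Real applications swap each stand-in for heavier machinery (deep networks, principled pruning schedules, genetic-programming search over expressions), but the division of labor is the same: one opaque model that predicts, one short expression that explains.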

Complexity as Incompressible Representations

Complexity science and deep learning both deal with “incompressible representations” [00:41:30]. While simple rule systems with symmetries typify the physical world, the complex world involves “elaborate rule systems and broken symmetries and evolving initial conditions” [00:42:01]. Historically, complexity science has managed this by “taking averages” or looking at bulk properties, but neural networks may allow for “slightly less Draconian averages” [00:42:38].

Constructs and Schemas

David Krakauer defines complex systems as entities that “encode reality” and “encode history” [00:48:10]. Unlike a rock, which contains no information about its environment, a microbe or human brain holds “simulacra” or “mirrors of reality,” reflecting 3.5 billion years of history [00:48:18]. These are “parsimonious encodings” or “schemas” of reality [00:48:51]. In this sense, a complex system is a “theorizer of the world,” and training a deep neural network is akin to creating such a theory or “organism” [00:50:27].

Meta-Occam

Occam’s Razor, the principle of choosing the simplest explanation, is a heuristic widely applied in physical sciences [01:09:36]. However, in the complex world, which is “full of these horrible high dimensions which seem irreducible,” Occam’s Razor “doesn’t seem to apply” to the final object [01:10:41].

Krakauer introduces Meta-Occam: “There are domains of human inquiry where the parsimony is in the process not in the final object” [01:11:19].

  • Physics: Has parsimonious theories of objects (e.g., the atom) [01:12:57].
  • Complexity Science: Seeks “parsimonious theories for generating complicated objects” [01:13:30]. Examples include evolution by natural selection and reinforcement learning, which are simple processes that can produce arbitrarily complex objects [01:13:50].

This reflects a fundamental difference: physical theory offers parsimonious models of minimal objects, while complexity science has parsimonious “meta-Occam” processes but decidedly non-parsimonious objects [01:14:50]. Thus, complexity science can be seen as “the search for meta-Occam processes” [01:15:08].
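A small illustration of parsimony residing in the process rather than the object (an example chosen for this summary; the episode cites evolution and reinforcement learning rather than cellular automata): an elementary cellular automaton whose entire update rule fits in eight bits, yet whose output pattern resists any comparably short description.

```python
# Rule 30, an elementary cellular automaton: an eight-bit rule (a maximally parsimonious
# "process") that unfolds into an intricate, hard-to-compress pattern (a decidedly
# non-parsimonious "object").
import numpy as np

RULE = 30
width, steps = 79, 40
rule_table = [(RULE >> i) & 1 for i in range(8)]   # output bit for each 3-cell neighborhood

row = np.zeros(width, dtype=int)
row[width // 2] = 1                                # start from a single "on" cell
for _ in range(steps):
    print("".join("#" if cell else "." for cell in row))
    left, right = np.roll(row, 1), np.roll(row, -1)
    neighborhood = (left << 2) | (row << 1) | right  # encode each neighborhood as 0..7
    row = np.array([rule_table[n] for n in neighborhood])
```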

AI and Societal Risks

The discussion also touched on the perceived “existential risk” of AI [01:17:21]. Krakauer acknowledges that more powerful machine intelligence will likely emerge, but regards the current “overheated speculations” about AI’s capabilities as largely marketing [01:18:27].

More immediate and tangible risks include:

  • Misuse of Narrow AI: Examples include the development of sophisticated police states [01:19:03].
  • Idiocracy Risk: As AI takes over more tasks, humans may “stop investing in achieving those intellectual skills,” potentially leading to a societal devolution [01:20:01].
  • Acceleration of “Game A”: AI could accelerate current societal trajectories, potentially bringing humanity closer to global catastrophic risks faster [01:21:20].

However, historical precedent from technologies like genetic engineering, nuclear weapons, and automobiles suggests that humanity has a track record of developing “small regulatory interventions” that minimize risks [01:24:05]. The proliferation of AI could lead to the development of “info agents” that filter the “flood of sludge” online, creating a “constructive network of mutual curation” [01:28:47].