From: mk_thisisit
Despite significant advancements, current AI systems are described as “very stupid in many ways” because they lack understanding of the physical world, permanent memory, reasoning, and planning capabilities that humans possess [00:00:00]. These limitations highlight the path for the future of artificial intelligence development [00:05:50].
Current State and Limitations of AI
Currently, AI systems are proficient at manipulating language, leading to an overestimation of their intelligence [00:00:09]. However, they do not understand the physical world, lack permanent memory, and cannot truly reason or plan [00:00:13] [00:05:42]. These are considered key features of intelligent behavior [00:05:54].
A major challenge for AI is Moravec’s paradox [00:14:50]. Computers can excel at complex tasks like chess or mathematical puzzles, but struggle with physical tasks such as manipulating objects or jumping, which animals perform with ease [00:15:05]. This is because the real world is far more complex than discrete symbol spaces, making it difficult for current AI techniques to operate in it [00:15:29].
Evolution of Deep Learning
Deep learning has gained popularity in two major waves [00:03:16]:
- Late 1980s to Mid-1990s: The first wave saw good results with multi-layer neural networks for tasks like image and handwritten character recognition [00:03:21]. Enthusiasm waned by the mid-90s because the approach required huge amounts of high-quality data and expensive computing resources, neither of which was readily available before the internet became widespread [00:04:08].
- 2000s onwards: Interest gradually grew, leading to an “explosion” around 2013, as the research world realized deep learning’s effectiveness across many fields [00:04:42]. Since then, it has developed at a dizzying pace [00:05:02].
A foundational paper on deep learning from 2015, co-authored with Nobel Prize winner Geoffrey Hinton, served as a “manifesto” for the scientific community. It presented applications and directions for future development, and contributed to the field’s popularization [00:01:13] [00:02:22].
Machine Learning Paradigms
Three main paradigms exist in machine learning:
- Supervised Learning: The most classic approach, where the system is given correct answers during training, like identifying a “table” in an image. It learns to generalize from many examples [00:08:51].
- Reinforcement Learning: The system receives feedback (good or bad) on its results, similar to how humans learn to ride a bike [00:09:51]. While effective for games like chess or Go, it is “extremely ineffective” in the real world due to the vast number of trials required, making it impractical for tasks like training autonomous cars [00:10:17].
- Self-Supervised Learning: This paradigm, responsible for recent advances in natural language understanding and chatbots, trains the system to capture the structure of input data (e.g., text) by predicting missing elements [00:10:55]. Large language models (LLMs) are trained on this principle, predicting the next word in a sequence [00:11:38].
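The supervised case in the list above can be made concrete with a toy example. The sketch below uses the classic perceptron update rule as a stand-in (the summary does not name a specific algorithm); the four-point dataset and the function names are illustrative assumptions. The system is shown the correct label for each input and nudges its parameters whenever its answer is wrong.

```python
def train_perceptron(examples, epochs=20):
    """Fit a linear classifier from labeled examples.

    examples: list of ((x1, x2), label) pairs with label in {0, 1}.
    Integer inputs and a step size of 1 keep the arithmetic exact.
    """
    w = [0, 0]  # one weight per input feature
    b = 0       # bias term
    for _ in range(epochs):
        for (x1, x2), label in examples:
            predicted = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = label - predicted  # 0 when the answer was correct
            # Move the parameters toward the correct answer.
            w[0] += error * x1
            w[1] += error * x2
            b += error
    return w, b


def predict(model, point):
    """Apply the trained parameters to a (possibly unseen) point."""
    w, b = model
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else 0


# Toy labeled data (an assumption): label is 1 only when both inputs are 1.
TRAINING_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
model = train_perceptron(TRAINING_DATA)
```

After training, `predict(model, (1, 1))` returns 1 and `predict(model, (0, 1))` returns 0. Real image classifiers follow the same loop of predicting, comparing with the label, and adjusting, only with millions of parameters instead of three.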
While self-supervised learning works incredibly well for language, it struggles with the physical world [00:12:27]. The physical world is continuous and involves too many unpredictable details, unlike discrete language symbols [00:13:35] [00:13:51] [00:47:47].
Future Implications and Applications of AI Technology
Currently, AI systems are described as “very stupid in many ways” [00:00:00]. Despite their ability to manipulate language very well, they lack understanding of the physical world, do not possess permanent memory like humans, and struggle with reasoning and planning [00:00:09] [00:05:42] [00:05:50]. These limitations highlight critical areas for the future of artificial intelligence development.
Current State and Limitations of AI
While AI excels at language manipulation, this capability often leads to an overestimation of its general intelligence [00:00:09] [00:05:37]. Fundamental gaps remain in AI’s ability to comprehend the physical world, retain long-term memory, and perform complex reasoning or planning [00:05:45] [00:05:50]. These aspects are considered essential for truly intelligent behavior [00:05:54].
A key challenge for AI is Moravec’s paradox, which observes that computers struggle with physical tasks that are easy for humans or animals, such as manipulating objects, despite excelling at complex logical or mathematical problems [00:14:50] [00:15:05]. This difficulty arises because the real world’s complexity, with its continuous and unpredictable nature, differs significantly from the discrete, symbolic spaces where AI typically operates [00:15:29].
Evolution of Deep Learning
The field of deep learning has experienced two major waves of enthusiasm and development:
- Late 1980s to Mid-1990s: The first wave saw promising results using multi-layer neural networks for tasks like image and handwritten character recognition [00:03:21]. However, this initial excitement subsided due to the immense requirements for large, high-quality datasets and expensive computing power, which were not readily available at the time [00:04:08].
- 2000s Onwards: Interest in deep learning gradually resurfaced, culminating in a “real explosion” around 2013 [00:04:42]. The research community recognized the widespread applicability and effectiveness of deep learning, leading to rapid development ever since [00:05:02].
A significant publication in 2015, co-authored with Geoffrey Hinton, served as a “manifesto” for the scientific community, outlining deep learning’s applications and future directions and contributing to its popularization [00:01:13] [00:02:22].
Machine Learning Paradigms
Three primary paradigms define machine learning:
- Supervised Learning: This classic method involves training a system by providing correct answers, such as labeling images [00:08:51]. The system adjusts its parameters to align its output with the expected result, developing the ability to generalize to unseen but similar data [00:09:08].
- Reinforcement Learning: Here, the system learns through feedback on whether its actions were good or bad, akin to how humans learn to ride a bicycle [00:09:51]. While effective for games (e.g., chess, Go) where systems can play millions of games against themselves, this method is “extremely ineffective” in complex real-world scenarios, making it impractical for tasks like training autonomous cars without thousands of crashes [00:10:17].
- Self-Supervised Learning: This paradigm has driven recent breakthroughs in natural language understanding and chatbots [00:10:55]. Systems are trained to capture the inherent structure of input data (e.g., text) by predicting missing words. Large language models (LLMs) operate on this principle, predicting subsequent words in text sequences [00:11:38].
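The next-word objective in the last item can be illustrated with a deliberately tiny model. The sketch below is not how an LLM is built; it assumes a bigram counter as a stand-in and a made-up three-sentence corpus. What it does share with LLM training is that the text itself supplies the "correct answer" (the word that actually comes next), so no human labeling is needed.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for previous, following in zip(words, words[1:]):
        followers[previous][following] += 1
    return followers


def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Made-up corpus for illustration only.
CORPUS = "the cat sat on the mat . the cat ate the fish . the dog sat on the rug ."
bigram_model = train_bigram(CORPUS)
```

Here `predict_next(bigram_model, "sat")` returns `"on"`, because "on" follows "sat" in every occurrence. The vocabulary is a finite set of discrete symbols, which is what makes this kind of prediction tractable for text.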
Although self-supervised learning is highly successful for textual data, it faces significant challenges when applied to the physical world [00:12:27]. The physical world is inherently continuous and unpredictable, unlike the discrete, symbolic nature of language, making accurate prediction of future events difficult or impossible [00:13:35] [00:47:47].
Future Prospects and Challenges
The next significant step for AI involves designing systems capable of functioning in the physical world, possessing permanent memory, and performing reasoning and planning [00:06:09] [00:06:16]. Such systems are expected to exhibit emotions such as “excitement or joy,” which arise from predicting that a goal will be achieved [00:06:25]. However, they will not be built with inherent negative emotions such as anger or jealousy [00:07:11].
A critical challenge is developing AI that can understand complex sensory data, particularly sight, which is essential for machines to learn as effectively as humans and animals [00:16:40] [00:16:54]. The amount of visual information a child processes in its first four years of life is comparable to the entire textual data used to train the largest language models (approximately 10^14 bytes), indicating that text-only training is insufficient for human-level AI [00:18:11] [00:18:38].
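The comparison above can be checked with back-of-the-envelope arithmetic. The summary only gives the text-side figure (approximately 10^14 bytes); the waking hours and the optic-nerve data rate below are assumed round numbers chosen for illustration, not values stated in this summary.

```python
import math

# From the summary: text used to train the largest language models.
TEXT_CORPUS_BYTES = 10**14

# Assumptions for illustration (not stated in the summary):
AWAKE_HOURS_IN_FOUR_YEARS = 16_000     # roughly 11 waking hours per day
OPTIC_NERVE_BYTES_PER_SEC = 2_000_000  # about 2 MB/s of visual input

SECONDS_PER_HOUR = 3600
visual_bytes = (
    AWAKE_HOURS_IN_FOUR_YEARS * SECONDS_PER_HOUR * OPTIC_NERVE_BYTES_PER_SEC
)

# Compare orders of magnitude rather than exact values.
visual_order = round(math.log10(visual_bytes))
text_order = round(math.log10(TEXT_CORPUS_BYTES))
ratio = visual_bytes / TEXT_CORPUS_BYTES
```

Under these assumptions `visual_bytes` is 1.152 × 10^14, the same order of magnitude as the text corpus. The exact inputs are rough, but the conclusion is insensitive to them: changing either assumption by a factor of two still leaves the two quantities within one order of magnitude.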
Reasoning and Planning
For AI systems to reason and plan, they need to develop abstract representations from observations [00:24:43]. Current large language models often employ a “primitive” search strategy in “token space,” generating numerous hypothetical sequences and then selecting the best one [00:26:41]. This method is computationally expensive and differs from human cognitive processes, where reasoning occurs in an internal mental state rather than by generating and analyzing many external actions [00:27:14] [00:28:11].
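The "generate many sequences, then pick the best" strategy described above can be sketched as a best-of-N search. Everything here is a toy assumption: `sample_sequence` stands in for a language model sampling tokens, and `score` is a hypothetical verifier that happens to know the desired answer. The point is only to show the shape of the search and why its cost grows with the number of candidates.

```python
import random

VOCAB = list("abcdefgh")  # toy token vocabulary (assumption)


def sample_sequence(rng, length):
    """Stand-in for a model sampling one candidate, token by token."""
    return [rng.choice(VOCAB) for _ in range(length)]


def score(sequence, target):
    """Hypothetical verifier: number of positions that match the target."""
    return sum(a == b for a, b in zip(sequence, target))


def best_of_n(target, n_candidates, seed=0):
    """Generate n_candidates full sequences and keep the highest-scoring one."""
    rng = random.Random(seed)
    candidates = [sample_sequence(rng, len(target)) for _ in range(n_candidates)]
    best = max(candidates, key=lambda seq: score(seq, target))
    return best, score(best, target)


TARGET = list("abcd")
_, score_with_1 = best_of_n(TARGET, n_candidates=1)
_, score_with_200 = best_of_n(TARGET, n_candidates=200)
```

Every candidate is generated in full before it is judged, so the work scales linearly with `n_candidates` while the improvement in the best score does not. That is the inefficiency the passage contrasts with reasoning in an internal, abstract state.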
A key future development involves implementing hierarchical planning in AI, allowing systems to define intermediate goals to achieve a larger objective, similar to how humans break down complex tasks [00:29:25] [00:30:36]. This is a significant challenge for the coming years [00:30:44].
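A minimal sketch of hierarchical planning, assuming the decomposition of goals into sub-goals is already known. The goal names and the `SUBGOALS` table are invented for illustration. The planner expands a high-level goal recursively until only primitive actions remain.

```python
# Hand-written decomposition table (an assumption for illustration).
# Goals that do not appear as keys are treated as primitive actions.
SUBGOALS = {
    "travel to another city": ["get to the airport", "take the flight"],
    "get to the airport": ["leave the building", "hail a taxi"],
    "leave the building": ["stand up", "walk to the door"],
}


def plan(goal, subgoals=SUBGOALS):
    """Expand a goal into an ordered list of primitive actions."""
    if goal not in subgoals:
        return [goal]  # primitive action: nothing left to decompose
    steps = []
    for intermediate_goal in subgoals[goal]:
        steps.extend(plan(intermediate_goal, subgoals))
    return steps


itinerary = plan("travel to another city")
```

Here `itinerary` is `["stand up", "walk to the door", "hail a taxi", "take the flight"]`. The recursion itself is trivial; the hard part, and the open challenge the passage refers to, is that in this sketch the hierarchy is supplied by hand, whereas an intelligent system would have to work out suitable intermediate goals on its own.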
Consciousness
The concept of consciousness remains undefined in science, with no measurable indicator to determine its presence [00:07:27]. The speaker suggests that the difficulty in defining consciousness might stem from asking the wrong questions, likening it to the historical confusion over how the brain perceives the world the right way up even though the image on the retina is inverted [00:22:29]. While the world is “obsessed” with consciousness, it may be a uniquely human characteristic that makes individuals different, rather than a quantifiable attribute [00:22:18] [00:23:31].
Information and Entropy
The quantification of information is a fundamental challenge across computer science, physics, and information theory [00:19:24]. The amount of information extracted from a message or sensory data is not absolute but depends on the interpreter [00:19:43]. This relativity implies that many concepts in physics, like entropy (a measure of ignorance about a physical system’s state), lack truly objective definitions [00:20:31] [00:20:38].
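One way to make this dependence on the interpreter concrete is Shannon's code length, where a symbol with probability p costs -log2(p) bits. The sketch below assumes that each "interpreter" can be modeled as a probability distribution over symbols; the two distributions and the message are invented for illustration. The same message carries a different number of bits depending on which model reads it.

```python
import math


def message_bits(message, model):
    """Total Shannon code length of `message` under `model`, in bits."""
    return sum(-math.log2(model[symbol]) for symbol in message)


# Two hypothetical interpreters with different expectations.
uniform_model = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
skewed_model = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

MESSAGE = "aaab"
bits_uniform = message_bits(MESSAGE, uniform_model)
bits_skewed = message_bits(MESSAGE, skewed_model)
```

Under the uniform model the message costs 8 bits (four symbols at 2 bits each); under the skewed model, which already expects "a" to be common, it costs 5 bits (three at 1 bit plus one at 2 bits). Neither number is "the" information content of the message in isolation.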
Applications of AI Technology
The future of AI promises widespread applications, particularly in robotics and smart devices.
Robotics
While industrial robots are common for repetitive tasks in controlled environments (e.g., painting cars), autonomous systems like self-driving cars still fall short of human reliability [00:31:12] [00:31:59]. Elon Musk’s repeated predictions of achieving level five autonomy for Tesla in five years have not materialized, indicating the complexity of real-world physical tasks [00:32:19] [00:32:25]. The “coming decade” is anticipated to be the “decade of robotics” [00:34:42]. Many humanoid robot companies are betting on rapid AI advances in the next 3-5 years to make their physically capable robots intelligent enough to navigate the real world [00:34:10].
Smart Devices and Medical Applications
The near future envisions billions of people using AI assistance daily through smart glasses or smartphones [00:42:17] [00:42:51]. Such devices, like Meta’s smart glasses, already feature AI systems that can answer questions or identify objects from camera images [00:42:40].
Deep learning methods show immense promise in medical applications, particularly in diagnostics such as mammography for breast cancer [00:57:17] [00:57:22]. Startups like Ataris, founded at New York University, aim to leverage AI for breast cancer prediction, moving beyond diagnosis alone to influence treatment [00:57:02] [00:58:21].
Computing Infrastructure and Investment
The widespread adoption of AI assistance will necessitate massive computing infrastructure. Running large language models and other AI systems for billions of users requires enormous computing power. Most of the investment (e.g., Meta’s $80 billion and the rumored Stargate project’s $500 billion over 5-10 years) is therefore directed at inference (running models) rather than at training, which is comparatively cheaper [00:43:12] [00:43:31] [00:44:03].
Open Research and Collaboration
Open research and open-source software are critical drivers of AI progress [00:36:34]. When research and code are published openly, the entire global community benefits, leading to faster development and innovation [00:36:10] [00:37:01]. Meta is a strong supporter of this approach, with its Paris lab creating foundational models like LLaMA [00:37:40] [00:37:42]. The prevalence across industry and academia of PyTorch, an open-source framework created by Meta and now managed by the Linux Foundation, demonstrates the power of collaborative development [00:39:37].
The speaker emphasizes that good ideas emerge globally, and no single institution or region holds a monopoly [00:38:07] [00:41:41]. Open cooperation, rather than perceived competition, accelerates progress in the field [00:42:06].
Europe’s Role in AI
Europe holds a significant position in the global AI landscape, largely due to its talent pool of programmers, mathematicians, physicists, and engineers [00:53:01]. Many leading AI scientists globally originate from Europe [00:53:13]. However, regulatory uncertainty within the European Union has hindered the full deployment of certain AI applications, such as the vision function in Meta’s smart glasses [00:52:02] [00:52:41].
Personal Reflections on AI Development
One regret the speaker expresses is not taking an interest in self-supervised learning until the mid-2000s, about ten years later than it should have happened [00:54:20] [00:54:57]. That period was a “drought” in interest in neural network and deep learning research [00:55:05]. The speaker also reflects on the work on convolutional neural networks (CNNs), invented in 1988, which are fundamental to modern AI applications ranging from character recognition and driver assistance systems to speech recognition and plant identification apps [00:50:07] [00:50:49] [00:51:20].