From: jimruttshow8596

George Hotz, known for hacking the iPhone and PlayStation 3 and for working at Google’s Project Zero, shifted his focus to AI, driven by a desire to automate vulnerability finding [00:03:05]. That ambition eventually led him to found comma.ai, a company focused on open-source self-driving car systems [00:03:49].

Founding and Motivation

comma.ai grew out of a failed contract with Elon Musk to develop software that would replace the Mobileye chip in Tesla vehicles [00:04:41]. Mobileye produces chips that run proprietary perception algorithms for Advanced Driver-Assistance Systems (ADAS) features such as detecting lane lines and other cars [00:05:32]. Although the contract fell through, Hotz went on to build an autopilot clone, intending to sell it to other car manufacturers [00:05:13]. Building the clone was relatively quick, but selling it to car companies proved “impossible” [00:05:22].

comma.ai’s Technical Approach

Camera-Only Philosophy

comma.ai champions a camera-only approach to self-driving, asserting that since humans drive effectively with only two cameras (eyes), additional sensors like Lidar are not strictly necessary [00:06:10] [00:06:41]. This contrasts with systems like Waymo, which traditionally rely on Lidar [00:06:07].

Evolution of the Driving Model

Initially, comma.ai’s system attempted a direct supervised learning approach: predicting the steering wheel angle from camera images (f(x) = y, where x is the image and y is the steering angle) [00:10:49]. However, this “behavioral cloning” method failed in real-world driving because small “epsilon errors” accumulate over time: the model’s actions influence its subsequent inputs, which breaks the independent and identically distributed (IID) assumption behind the training data [00:11:18] [00:12:26].
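The compounding effect can be illustrated with a toy rollout. The sketch below is not comma.ai’s model: it assumes a made-up kinematic rule in which each steering error bends the heading and the heading then shifts the lateral offset, and the names `open_loop_drift` and `eps` are hypothetical. It suggests how a per-step error that looks negligible on IID test frames can grow quickly once the model’s own output feeds its next input.

```python
def open_loop_drift(eps: float, steps: int) -> float:
    """Lateral offset after `steps` steps when every steering output is off by `eps`.

    Toy kinematics (an assumption for illustration only): a steering error
    changes the heading, and the heading changes the lateral offset, so a
    constant per-step error is integrated twice and nothing pulls the car back.
    """
    heading = 0.0
    offset = 0.0
    for _ in range(steps):
        heading += eps      # the per-step prediction error bends the path
        offset += heading   # the bent path carries the car away from center
    return offset


if __name__ == "__main__":
    eps = 0.001  # hypothetical per-step error, in arbitrary units
    for steps in (10, 100, 1000):
        print(steps, open_loop_drift(eps, steps))
```

In this toy model the offset equals eps * steps * (steps + 1) / 2, so it grows roughly with the square of the rollout length even though the error on any single frame stays at eps.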

To mitigate this, a “corrective pressure” mechanism was introduced, using lane line detection to subtly adjust steering [00:15:15]. Later, the system evolved to remove reliance on lane lines due to their ambiguous definition, instead learning to predict where a human would drive the car based on training data [00:15:45] [00:17:04].

Simulation and Data

A key aspect of comma.ai’s development is its “small offset simulator,” which applies geometric perturbations to real human-driven video data, allowing the model to learn corrective pressure by simulating how its own actions would affect the driving scenario [00:18:40] [00:19:50]. This contrasts with traditional game-engine simulators, which face challenges in modeling the behavior of other cars [00:20:09].
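One way to picture the small-offset idea is as data augmentation on logged drives. The sketch below is an illustration under heavy assumptions, not comma.ai’s simulator: it replaces the geometric image warp with a single number (lateral offset from the human’s path), and `Sample`, `perturb`, `augment`, `recovery_gain`, and the linear recovery rule are all hypothetical. It shows how shifting the viewpoint of a real frame can yield a training example whose label steers back toward the path the human actually drove.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Sample:
    # Stand-in for a camera frame: meters to the right of the human's path.
    lateral_offset_m: float
    # Training label: positive means steer right, negative means steer left.
    target_steer: float


def perturb(sample: Sample, shift_m: float, recovery_gain: float = 0.1) -> Sample:
    """Shift the viewpoint sideways and relabel with a recovery action.

    A shift to the right (positive) lowers the steering label, i.e. asks the
    model to steer left, back toward the recorded human path. The linear rule
    and the gain value are assumptions made for this sketch.
    """
    return Sample(
        lateral_offset_m=sample.lateral_offset_m + shift_m,
        target_steer=sample.target_steer - recovery_gain * shift_m,
    )


def augment(log: List[Sample], max_shift_m: float = 0.5,
            copies: int = 2, seed: int = 0) -> List[Sample]:
    """Return the original samples plus `copies` perturbed versions of each."""
    rng = random.Random(seed)
    out = list(log)
    for sample in log:
        for _ in range(copies):
            shift = rng.uniform(-max_shift_m, max_shift_m)
            out.append(perturb(sample, shift))
    return out
```

Because every perturbed sample starts from real footage, the other road users in the frame still behave as they did in reality, which is the advantage the text attributes to this approach over game-engine simulation.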

comma.ai boasts the second-largest driving dataset globally after Tesla, with tens of millions of miles collected from over 10,000 weekly active users across diverse locations [00:19:12] [00:20:20]. Despite this vast dataset, they only train on about 5% of it due to diminishing returns and to facilitate faster iteration [00:41:18].

The comma.ai Product: openpilot

openpilot is an open-source self-driving system that can be installed in compatible cars (around 275 models) [00:57:07] [00:21:04]. The installation process is simple, typically taking 15-30 minutes, involving a Y-splitter cable connected to the car’s existing camera behind the rearview mirror [00:21:40] [00:21:52].

Functionality

openpilot operates as a Level 2 driver assistance system [00:17:00] [00:43:14]. It intercepts messages from the car’s camera and sends modified commands to the steering and braking systems [00:21:59]. By default, emergency braking features are not disabled [00:22:19]. Its primary function is to provide lane centering and adaptive cruise control, aiming to keep the car where a human would place it [00:22:46] [00:22:50].
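The interception described above can be pictured as a filter sitting between the stock camera and the rest of the car. This is a hypothetical illustration, not openpilot’s code: the message names (`STEERING_CMD`, `ACCEL_CMD`, `AEB_CMD`), the `Message` type, and the `relay` function are invented here, and real vehicles use brand-specific messages. It shows the pattern the text describes: driving commands are replaced, while everything else, including emergency braking, passes through untouched.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Message:
    name: str
    value: float


# Hypothetical names for the commands the add-on device takes over.
REPLACED = {"STEERING_CMD", "ACCEL_CMD"}


def relay(from_camera: Iterable[Message], overrides: Dict[str, float]) -> List[Message]:
    """Forward stock camera messages, substituting only the driving commands.

    Messages outside REPLACED (for example the made-up AEB_CMD) are always
    forwarded unchanged, even if an override is supplied for them.
    """
    out = []
    for msg in from_camera:
        if msg.name in REPLACED and msg.name in overrides:
            out.append(Message(msg.name, overrides[msg.name]))
        else:
            out.append(msg)
    return out
```

For example, `relay([Message("AEB_CMD", 1.0)], {"AEB_CMD": 0.0})` returns the emergency braking message unchanged, because that name is not in `REPLACED`.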

The system is particularly effective on interstate highways, where it can drive for an hour or more without human intervention [00:24:53] [00:25:00]. An experimental mode allows for navigation in urban environments, handling stop signs, traffic lights, and turns, comparable to early versions of Tesla’s Full Self-Driving (FSD) [00:25:07] [00:25:12].

No High-Precision Mapping

comma.ai does not use high-resolution maps, deeming them “worthless” [00:25:56]. Instead, it relies on standard definition maps similar to those humans use, aligning with the philosophy that computers don’t need “special stuff” [00:26:09] [00:26:15]. This contrasts with Waymo’s approach, which maps defined operational regions with high precision [00:26:50].

Safety, Regulation, and Liability

comma.ai operates within the self-certification model of the U.S. automotive industry, complying with safety standards like ISO 26262 [00:42:43] [00:42:53]. As a Level 2 system, the human driver remains fully liable for the vehicle’s operation [00:43:14] [00:43:17]. The system is designed to allow human override of steering and braking at all times [00:43:24].

comma.ai emphasizes the importance of driver monitoring, using an in-car camera to ensure the user’s eyes remain on the road [00:44:01]. This system is designed to be non-intrusive and provide helpful alerts, avoiding “alert fatigue” [00:44:53] [00:45:00]. User telemetry data is opt-out, contributing to the system’s improvement [00:45:37].

Regarding liability, comma.ai maintains that if a crash occurs, the human driver is responsible [00:36:36] [00:50:06]. Their terms of service clearly state user indemnification for liability [00:52:33]. While they distinguish between a driver’s decision and a product malfunction, they assert that the product is designed to not compromise the car’s core safety features [00:53:22] [00:54:07].

Comparison with Competitors

Waymo and Cruise

Waymo and Cruise, while logging millions of miles in no-driver mode, are described as “fancy remote control cars” or “trackless monorails” because they rely on remote human operators and centralized infrastructure (like cell phone networks) [00:26:26] [00:29:29] [00:30:11]. Their economic models, which depend on expensive robotaxis and the assumption that transportation costs stay fixed, are seen as unsustainable given rapid technological advances and competition from cheaper, open-source alternatives [00:31:50] [00:32:09].

Tesla

Tesla and comma.ai share a key business model: selling products to consumers profitably today [00:34:11] [00:34:40]. Both also perform processing locally on the device, rather than relying on constant cloud connectivity [00:39:07] [00:40:51].

However, their autonomy approaches differ:

  • Tesla: Views driving as a “physics problem,” often using rigid maneuvers and displaying virtual 3D representations of cars, indicating a focus on state localization [00:34:26] [00:34:55].
  • comma.ai: Focuses on a “holistic” approach, predicting human actions (“where does a human drive the car?” “when does a human hit the brakes?”) rather than detailed state representation [00:34:45] [00:34:57].

Functionally, Tesla generally leads in high-end capabilities (e.g., navigating complex intersections) but can be less “chill” in its execution, with jarring corrections and phantom braking [00:36:01] [00:36:22] [00:37:06]. comma.ai, while having fewer complex features, prioritizes smooth and user-friendly driving, with gentler, more “humanlike” failure modes [00:36:40] [00:37:10] [00:38:52]. Tesla also uses significantly more computing power (around 100x) for both training and in-car processing [00:35:25] [00:35:46].

Challenges and Advancements

A common criticism of self-driving cars, voiced by critics such as Gary Marcus, is that they cannot handle the “zillion corner cases” [00:12:43]. Hotz counters that with massive datasets (hundreds of millions of miles), even rare corner cases are sufficiently represented [00:13:00]. He argues that human drivers, despite seeing less data, excel because of their general intelligence [00:13:08]. The challenge is not corner cases but the system’s “world model” and its ability to integrate clues, a cutting-edge machine learning problem whose solutions are not yet fully deployed in self-driving cars [00:13:58] [00:14:56].

The U.S. traffic fatality rate (approx. one fatality per 100 million driven miles) highlights that humans are “absurdly good drivers,” setting a high bar for autonomous systems [00:08:52] [00:09:51]. comma.ai has logged over 100 million miles with zero fatalities [00:09:31].
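As a rough sanity check on these figures, the sketch below asks how likely a fatality-free 100 million miles would be for drivers at the quoted human rate. It assumes fatalities follow a Poisson process with a constant rate, an assumption that does not come from the interview, and `prob_zero_events` is a hypothetical helper name.

```python
import math


def prob_zero_events(rate_per_mile: float, miles: float) -> float:
    """Probability of observing no events under a Poisson model (an assumption)."""
    expected_events = rate_per_mile * miles
    return math.exp(-expected_events)


if __name__ == "__main__":
    human_rate = 1 / 100_000_000   # about one fatality per 100 million miles
    logged_miles = 100_000_000     # mileage figure quoted in the text
    print(prob_zero_events(human_rate, logged_miles))
```

Under that assumption the expected number of fatalities is one and the probability of seeing none is e^-1, roughly 0.37. The comparison is also loose because a Level 2 system always has a human supervising, so the mileage is not purely machine-driven.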

Future Vision

comma.ai’s ultimate goal is not merely to solve self-driving but to use it as a stepping stone to general-purpose robotics and “artificial life” [00:39:54] [00:40:18] [00:40:21]. They envision a “comma body”: a $25,000 robot companion capable of cooking and cleaning [00:47:28].

Hotz sees self-driving as a challenging but manageable narrow AI problem, distinct from general robotics due to the ease of data collection and the low dimensionality of the driving task (steering and acceleration) compared to, for example, the complex degrees of freedom in a human hand [00:46:17] [00:46:56] [00:47:01].

Alongside his self-driving work, Hotz is also the CEO of Tinygrad, a significantly simpler machine learning framework (about 5,200 lines of code) that competes with TensorFlow and PyTorch and can run complex models like Stable Diffusion and Llama [00:55:55] [00:56:17]. Tinygrad is used in openpilot to run the on-device model and aims to enable the creation of machine learning ASICs [00:57:12] [00:57:29].