From: mk_thisisit
The integration of artificial intelligence (AI) algorithms with robotics is a rapidly advancing field whose ultimate direction is still unknown [00:01:01]. The field has seen particularly significant development in recent years [00:03:48].
Applications in Logistics and Warehousing
Marek Cygan, a professor at the University of Warsaw and a winner of the Google Code Jam programming competition, works at Nomagic, a company that programs robots to move products in logistics centers and warehouses [00:31:00]. These robots pick up, pack, and dispatch items [00:37:00].
The primary challenge in this area is enabling robots to handle a wide variety of products, regardless of their shape, appearance, or material [00:04:00]. Such problems were previously intractable, but machine learning methods now make them solvable [00:04:28]: robots can grasp, transfer, and pack diverse items [00:04:22].
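To make the idea concrete, below is a minimal sketch of one bin-picking step in Python. Nothing here reflects Nomagic's actual system; the function name and the naive depth heuristic are invented for illustration, with a comment marking where a learned grasp model would take over from hand-coded rules.

```python
# Hypothetical sketch of a bin-picking step; these interfaces are not
# Nomagic's actual API. They illustrate the idea that a learned model
# replaces hand-coded, shape-specific grasp rules.
import numpy as np

def propose_grasp(depth_image: np.ndarray) -> tuple[int, int]:
    """Return a (row, col) pixel at which to attempt a grasp.

    Naive baseline: grasp the topmost surface (smallest depth).
    A production system would instead score candidate grasps with a
    model trained on successful and failed picks across many item types.
    """
    row, col = np.unravel_index(np.argmin(depth_image), depth_image.shape)
    return int(row), int(col)

if __name__ == "__main__":
    # Fake 4x4 depth map (meters from camera); the 0.30 cell is the
    # highest point in the bin and becomes the grasp candidate.
    depth = np.array([
        [0.50, 0.48, 0.47, 0.50],
        [0.49, 0.30, 0.46, 0.50],
        [0.50, 0.45, 0.44, 0.50],
        [0.50, 0.50, 0.50, 0.50],
    ])
    print(propose_grasp(depth))  # -> (1, 1)
```

The baseline heuristic fails exactly where the interview says hand-written rules fail: on irregular, deformable, or reflective items, which is why the scoring step is learned from data instead.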
While automating tasks like picking up easily manipulable, tightly closed boxes is straightforward [00:05:25], challenges arise when items are irregularly packed or boxes can open [00:05:33]. The degree of automation in logistics centers varies by country, depending on labor costs [00:04:40]. Automating warehouse work, often described as noisy and unpleasant, is seen as beneficial, since it replaces jobs people would not wish for their children [00:12:06].
Robot Senses and Perception
For robots to perceive the world, they must be equipped with sensors, such as cameras or microphones, to collect data from the outside world [00:06:17]. This data is then processed by a computer program, allowing the robot to make decisions [00:06:28]. The tools for processing visual and auditory data have significantly advanced, enabling robots to identify objects in pictures or recognize people [00:06:46].
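The sense-process-decide pipeline described above can be sketched in a few lines. The `read_camera` and `detect_objects` helpers below are hypothetical stand-ins for a real camera driver and a trained vision model; only the shape of the loop is the point.

```python
# A minimal sense-process-decide loop. `read_camera` and
# `detect_objects` are invented stand-ins; a real system would plug in
# an actual camera driver and a trained detector here.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    confidence: float

def read_camera() -> bytes:
    return b"\x00" * (640 * 480 * 3)  # stand-in for a real RGB frame

def detect_objects(frame: bytes) -> list[Detection]:
    # A trained vision model would run here; we return a fixed answer.
    return [Detection("box", 0.97), Detection("person", 0.21)]

def decide(detections: list[Detection]) -> str:
    # The "decision" step: act only on confident detections.
    confident = [d for d in detections if d.confidence > 0.5]
    return f"pick {confident[0].label}" if confident else "wait"

frame = read_camera()               # 1. a sensor collects data
detections = detect_objects(frame)  # 2. a program interprets it
print(decide(detections))           # 3. the robot decides -> "pick box"
```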
While vision and hearing capabilities are well-developed, the sense of touch is more challenging, both from a hardware and a data-collection perspective [00:07:18]. Progress in robotics often rides on technology driven by consumer demand, which has made components like cameras cheaper and better [00:07:34]. For touch sensors, however, this "scale effect" is still in its initial stages: devices remain expensive, and collecting data is harder because it often requires active interaction with the environment [00:08:09]. Professor Cygan is less familiar with the sense of smell in robotics, though he notes challenges in processing it digitally and in biological and chemical safety [00:09:38].
According to Professor Cygan, robots already possess “senses” in the form of cameras and other sensors that collect data, which algorithms then process to understand and interact with the world [00:22:22].
Autonomy and Decision-Making
The level of autonomy granted to robots depends on the potential consequences of their decisions [00:12:50]. For tasks with minor consequences, like knocking over a glass, developers are less hesitant to allow autonomous decisions [00:13:01]. However, for applications like autonomously controlled vehicles, where the consequences are far more serious, authorization processes are much more cautious [00:13:15].
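One way to picture this consequence-based gating is a severity check before any action executes. This is a toy sketch, not a real safety-certification process; the severity labels and the escalation rule are assumptions made for illustration.

```python
# Hedged sketch of consequence-based gating of autonomy: actions with
# minor expected harm run autonomously, everything else is escalated.
# Labels and policy are invented, not any production safety standard.
from enum import Enum

class Severity(Enum):
    MINOR = 1    # e.g. a knocked-over glass
    SERIOUS = 2  # e.g. maneuvering a vehicle near people

def authorize(action: str, severity: Severity) -> str:
    if severity is Severity.MINOR:
        return f"execute '{action}' autonomously"
    return f"defer '{action}' to a human operator"

print(authorize("grasp fragile item", Severity.MINOR))
print(authorize("change lanes", Severity.SERIOUS))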
The technological capacity to create autonomous robots today exists for specific, well-defined problems, such as machines on a production line that perform repeatable movements [00:13:40]. The distinction between a “machine” and a “robot” lies in autonomy; a robot makes decisions based on non-obvious external factors [00:16:16].
However, challenges remain, as illustrated by the contrast between Boston Dynamics’ advanced humanoids and a robotic coffee arm that constantly jams [00:14:40]. While complex human-like movements are achievable, reliability in simple, repetitive tasks can still suffer when the design or implementation is flawed [00:15:50].
Future Prospects and Concerns
Professor Cygan expresses more excitement than fear about the development of robotics and AI [00:16:53]. While acknowledging the potential dangers of any technology, he believes the greater threat comes from humans misusing technology rather than robots autonomously taking over [00:11:18].
The technological capabilities have vastly improved over the last two decades due to more powerful computers, a deeper understanding of tools for problem-solving, and the abundance of data from the internet [00:17:38]. The shift from rigid rule-based coding to machine learning, where algorithms learn from experimental data, is a key change [00:18:08].
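The shift he describes can be illustrated by contrasting the two paradigms on a toy task. The parcel-sizing task, the thresholds, and the tiny training set below are all invented; scikit-learn is used only as a familiar stand-in for "an algorithm fitted to experimental data".

```python
# Toy contrast between rule-based coding and machine learning.
# Task, thresholds, and data are invented for illustration.

# Rule-based: a human encodes the decision boundary explicitly.
def is_oversized_rule(length_cm: float, weight_kg: float) -> bool:
    return length_cm > 60 or weight_kg > 25

# Learned: the boundary is fitted from labeled examples instead.
from sklearn.linear_model import LogisticRegression

X = [[30, 5], [45, 10], [70, 20], [65, 30], [20, 2], [80, 40]]
y = [0, 0, 1, 1, 0, 1]  # 1 = oversized, labels come from experience
model = LogisticRegression(max_iter=1000).fit(X, y)

print(is_oversized_rule(70, 20))     # the hand-written rule's answer
print(model.predict([[70, 20]])[0])  # the fitted model's answer
```

The practical difference is maintenance: when the definition of "oversized" changes, the rule must be rewritten by hand, while the model is simply refitted on new labeled data.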
Professor Cygan was particularly surprised a few years ago when AI technology demonstrated the ability to generate useful code that could solve non-obvious programming issues [00:19:23]. This means a programmer can provide a function specification and receive code support, a capability he once thought was much further off [00:19:36].
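As a concrete picture of that workflow: the programmer supplies only a signature and docstring (the specification), and a code model returns a candidate body. No model is called below; the body shown is simply one plausible output, written by hand for illustration.

```python
# Illustration of the spec-to-code workflow the professor describes.
# The programmer writes the signature and docstring; a code-generation
# model would propose the body. This body is a hand-written stand-in.
def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Compress a string into (character, run_length) pairs,
    e.g. "aaab" -> [("a", 3), ("b", 1)]."""
    runs: list[tuple[str, int]] = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs

print(run_length_encode("aaab"))  # [('a', 3), ('b', 1)]
```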
The development of models like GPT-4, which can process not only text but also images and sound (multimodal training), suggests even greater possibilities for AI models [00:20:20]. The combination of such advanced models with robotics could allow for high-level control, where a model plans actions based on diverse data inputs, with lower-level modules handling specific sensor interactions [00:20:51].
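That layered architecture can be sketched as a high-level planner feeding abstract steps to low-level executors. The planner below is a canned stand-in for a multimodal model's output, and every name in the sketch is hypothetical.

```python
# Sketch of layered control: a high-level planner (standing in for a
# multimodal model) turns an instruction plus a scene description into
# abstract steps; low-level modules handle sensor/actuator details.
# All names are hypothetical.

def high_level_plan(instruction: str, scene: dict) -> list[str]:
    # A multimodal model would consume camera images and audio
    # directly; here a canned plan stands in for its output.
    if instruction == "clear the table" and scene["objects"]:
        return [f"pick {obj}" for obj in scene["objects"]] + ["report done"]
    return ["report done"]

def low_level_execute(step: str) -> None:
    # Real modules would close control loops over vision/touch here.
    print(f"[controller] executing: {step}")

scene = {"objects": ["cup", "plate"]}
for step in high_level_plan("clear the table", scene):
    low_level_execute(step)
```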
Regarding future AI development, Professor Cygan anticipates a “flattening” of major technological leaps after the significant jump to GPT-4 [00:24:25]. He expects more focus on making AI models cheaper, more common, and usable on mobile phones, thereby allowing for wider use of the technology rather than just further improvements in its core capabilities [00:24:36].
Notable Quotes
“We are going in a direction that is unknown how it will turn out.” [00:01:01]
“The moment when I was surprised was a few years ago when I saw that this technology is indeed able to generate pieces of code that are useful and are able to solve non-obvious issues.” [00:08:00]
“We will reach the point where robots will have senses.” [00:18:00]