We argued some time ago that embodied AI is the next big thing, hardly a unique insight, of course. Here, we're going to examine the opportunities for Nvidia to become a dominant player in this area, as its AI capabilities should translate well into embodied AI.
Embodied AI
Embodied AI enables self-learning humanoid robots to independently analyze data, make decisions, and adapt to changing circumstances in real time, right at the edge, that is, at the point of action, without requiring cloud connectivity. It's the result of progress on multiple fronts:
Advanced sensory systems: LiDAR and cameras, combined with multi-modal vision systems, are crucial for navigation, spatial awareness, and object recognition. Tactile and force sensors provide fine manipulation capabilities by detecting force, pressure, and texture, enabling robots to handle delicate objects and interact safely with humans and their surroundings. Then there are inertial measurement units and encoders, which are essential for balance, posture control, and precise motor feedback for coordinated movements.
High-performance AI processing hardware: powerful AI chips manage perception, decision-making, and motion control in real time (a minimal sense-think-act sketch follows this list).
Robust AI algorithms and software: deep learning powers perception and decision-making as well as motion planning and control. Simulations and digital twins on high-fidelity simulation platforms are used to develop, test, and refine algorithms before real-world deployment, reducing development cycles and improving safety. Lastly, reinforcement and imitation learning enable humanoid robots to rapidly adapt, learning from demonstrations or trial and error to autonomously improve tasks like manipulation and navigation.
Actuators and Robotics hardware: high-precision, lightweight actuators capable of replicating human-like movement and strength, including dexterous hands with many degrees of freedom.
Integrated systems for autonomy and human interaction: multimodal interaction capabilities combining vision, speech, gesture recognition, and emotion detection enable natural and safe human-robot collaboration.
Scalability and cost-reduction technologies: shrinking the size and cost of critical components, including sensors and AI chips, facilitates broader deployment beyond labs into industrial, commercial, and service environments.
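To make the interplay between sensing, on-board compute, and actuation concrete, here is a minimal, purely illustrative sense-think-act loop in Python. Every name and number in it (the simulated sensor readings, the trivial rule-based policy, the gains) is a hypothetical stand-in, not a description of any real humanoid stack, where perception, planning, and control run on dedicated hardware at very different rates.

```python
import random
from dataclasses import dataclass

# Hypothetical illustration of a sense-think-act loop; all names and numbers
# are invented for the sketch, not taken from any real robot stack.

@dataclass
class SensorReading:
    camera_obstacle_dist: float  # meters, from a (simulated) vision system
    imu_tilt: float              # radians, from a (simulated) IMU

def read_sensors() -> SensorReading:
    # Stand-in for camera + IMU drivers; real systems fuse many more modalities.
    return SensorReading(camera_obstacle_dist=random.uniform(0.2, 3.0),
                         imu_tilt=random.uniform(-0.05, 0.05))

def decide(reading: SensorReading) -> dict:
    # "Think": a trivial rule-based policy standing in for the AI model.
    speed = 0.0 if reading.camera_obstacle_dist < 0.5 else 0.5  # m/s
    balance_correction = -2.0 * reading.imu_tilt                # crude P-control term
    return {"speed": speed, "balance_torque": balance_correction}

def actuate(command: dict) -> None:
    # Stand-in for motor/actuator drivers.
    print(f"drive at {command['speed']:.2f} m/s, "
          f"balance torque {command['balance_torque']:+.3f} Nm")

if __name__ == "__main__":
    for _ in range(5):            # a real control loop would run at 100 Hz or more
        actuate(decide(read_sensors()))
```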
Hardware is improving rapidly
One continuously receding limitation is the hardware that produces the fine motor skills robots need to manipulate objects, and what could be more sensitive than prostate operations?
It turns out that robot-assisted operations are more precise. While they do not increase life expectancy, they do significantly reduce the quality-of-life side effects (incontinence, sexual health issues) that tend to plague non-robot-assisted operations.
Still, this is a robot-assisted system, not yet a fully autonomous one. Robots are already very agile, able to react quickly to rapidly changing circumstances and to strategize, as this table-tennis-playing robot demonstrates (or boxing robots).
Their movements are still a little less than fluid, but getting there (they are not like this, which is, of course, fake, although the fact that the video was created with AI says something about its capabilities). In theory, robots have multiple advantages:
They have much better senses, and can have additional ones: infrared vision, echolocation like bats and dolphins, radar or ultrasonic sensors, night vision, 360-degree vision, and so on. Or consider the “triboelectric effect,” which has been used in an artificial finger (source in German) to recognize the material it touches with 90 percent accuracy.
They are multi-talented. For instance, driven by LLMs, they can operate in almost any language and have expert knowledge in many fields.
They won't get sick, tired, or demotivated, don't need a pay rise or a holiday, and can work 22 hours a day (depending on recharging needs, which are also improving; the Chinese have even developed a robot that swaps its own batteries). They don't get lost in thought, daydream, get distracted, lose concentration, have a bad day, get angry at fellow employees, or engage in office politics.
They can work in hazardous or unpleasant circumstances (which also makes them prime candidates for war, but that's another topic).
They will also become much smarter and learn exponentially (see below).
Intelligence
The simple fact is that we already have AI-driven systems that can perform highly complex tasks like driving cars in traffic, and do that better than most of us (and they are still getting better).
Simulation training can cheaply prepare them for situations that are very rare, but the bulk of their capabilities has been acquired by ingesting endless hours of video from actual driving.
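As a toy illustration of why simulation is so cheap for rare events, the sketch below oversamples a hypothetical "pedestrian runs out" scenario that a simulator can generate on demand; all rates and parameters are invented for the example, not drawn from any real training pipeline.

```python
import random

# Hypothetical sketch: oversampling rare scenarios in simulation.
# Real-world driving logs might contain a pedestrian-runs-out event only once
# in many thousands of episodes; in simulation we choose the curriculum ourselves.

REAL_WORLD_RARE_EVENT_RATE = 1e-4   # assumed frequency in logged data
SIMULATED_RARE_EVENT_RATE = 0.5     # frequency we dial in for training

def sample_scenario(rare_event_rate: float) -> dict:
    """Generate one training scenario with randomized conditions."""
    return {
        "pedestrian_runs_out": random.random() < rare_event_rate,
        "friction": random.uniform(0.3, 1.0),     # wet vs. dry road
        "sun_glare": random.random() < 0.2,
    }

def count_rare(n: int, rate: float) -> int:
    return sum(sample_scenario(rate)["pedestrian_runs_out"] for _ in range(n))

if __name__ == "__main__":
    n = 10_000
    print("rare events in", n, "real-world-like episodes:",
          count_rare(n, REAL_WORLD_RARE_EVENT_RATE))
    print("rare events in", n, "simulated episodes:",
          count_rare(n, SIMULATED_RARE_EVENT_RATE))
```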
Most tasks we perform in the workplace are much simpler than prostate surgery or driving a car in traffic, and these are already well within the capabilities of robots. Robots have been used on the assembly line for decades, but with AI-driven robots, this is about to reach a whole different level.
Robots don't forget like we do, and while our knowledge is lost to the world when we die, robots' knowledge is simply transferred at the end of their economic life.
Also, any knowledge of any robot can, in principle, be transferred to another one (there are obstacles to this, but foundation models like Nvidia's GR00T are a significant step forward here).
Multiple ways of learning
Multiple methods can be employed to trigger learning in robots:
Humans wearing suits that collect data on their movements; the data is fed into a robot's neural network to train it (for instance, to dance or perform tasks), and other robots can then simply copy the result.
Imitation learning: there is a host of videos of robots learning by shadowing human behavior. A more efficient route is AI-generated simulations (which today can be created from a few lines of text), which have the added bonus of being able to generate situations that are difficult to capture on video (a minimal behavior-cloning sketch follows this list).
Reinforcement learning. In 2020, well before the commercial emergence of LLMs, robots were learning to walk by themselves through reinforcement learning, or playing Go (AlphaGo beat the world champion back in 2016) without handcrafted strategies, or playing soccer, cooperating with teammates.
Learning from other robots. While our knowledge degrades and disappears after death, robot knowledge only accumulates and can be transferred to other robots (especially with the help of foundation models like Nvidia's GR00T).
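The imitation-learning idea above can be reduced to behavior cloning: record (observation, expert action) pairs and fit a model that reproduces the mapping. The sketch below is a deliberately tiny stand-in, assuming a made-up one-dimensional task and a nearest-neighbor "model" instead of the large neural networks real systems train on motion-capture or video data.

```python
import random

# Toy behavior cloning: learn to copy an "expert" controller from demonstrations.
# The expert, the task (stabilize a 1-D state) and the nearest-neighbor learner
# are hypothetical stand-ins for motion-capture data plus a neural network.

def expert_action(state: float) -> float:
    return -1.5 * state            # the demonstrator's policy, unknown to the robot

# 1. Collect demonstrations (e.g. from a human in a motion-capture suit).
demos = [(s, expert_action(s)) for s in (random.uniform(-1, 1) for _ in range(200))]

# 2. "Train": here we simply memorize the demos; a real system would fit a network.
def cloned_policy(state: float) -> float:
    nearest_state, nearest_action = min(demos, key=lambda d: abs(d[0] - state))
    return nearest_action

# 3. Evaluate how closely the clone tracks the expert on new states.
errors = [abs(cloned_policy(s) - expert_action(s))
          for s in (random.uniform(-1, 1) for _ in range(100))]
print(f"mean imitation error: {sum(errors) / len(errors):.4f}")
```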
The ultimate goal: self-learning robots
When robots can not only execute tasks autonomously but also improve without human intervention, we're in the realm of self-learning robots:
are machines that can learn and adapt on their own without being programmed for every scenario. These robots use Artificial Intelligence (AI) and machine learning algorithms to teach themselves new skills through trial and error – essentially mimicking the way humans and animals learn. Rather than relying on pre-programmed rules, Self-Learning robots develop their own algorithms by detecting patterns in huge amounts of data. They learn from their experiences, interactions, and the environment, which is why the more they operate, the smarter they get.
In short, it's what you get when you combine AI and robotics. This works through reinforcement learning:
These robots start out with a basic set of algorithms and little knowledge about the world. As they interact with their environment, their AI systems track what happens and use reinforcement learning to determine which actions lead to the best outcomes. The robots then repeat those actions and continue improving over time through practice and repetition (And patience that’s far more than what any human could muster).
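A minimal sketch of that trial-and-error loop, written here as tabular Q-learning on a toy six-cell corridor; the states, rewards, and hyper-parameters are invented for the illustration, whereas real robots learn over far richer state spaces with deep reinforcement learning.

```python
import random

# Toy reinforcement learning: an agent in a 1-D corridor of 6 cells learns,
# purely by trial and error, that walking right (towards cell 5) earns a reward.
N_STATES, ACTIONS = 6, (-1, +1)          # actions: step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1    # learning rate, discount, exploration

q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def pick_action(state: int) -> int:
    if random.random() < EPSILON:                    # explore occasionally
        return random.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)       # otherwise exploit,
    return random.choice([a for a in ACTIONS if q[(state, a)] == best])  # ties at random

for episode in range(300):
    state = 0
    while state != N_STATES - 1:                     # episode ends at the goal cell
        action = pick_action(state)
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Core update: reinforce actions that led to better outcomes.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

print("learned preference for stepping right, per cell:",
      [round(q[(s, +1)] - q[(s, -1)], 2) for s in range(N_STATES - 1)])
```

After a few hundred episodes the preference for stepping right dominates in every cell, which is the practice-and-repetition dynamic the quote above describes.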
Self-learning robots can:
Adapt to changes in their environment.
Produce emergent behavior.
Become better autonomously through iteration.
This goes way beyond programmed robots; complex adaptive behavior is simply impossible to program in. Instead, the robots are given goals and objectives and develop their own path toward achieving them through self-learning.
So these AI-driven robots are capable of much more complex tasks, and they improve and adapt to their environment all by themselves. Given the rapid improvement in AI, this will increasingly spill over into the field of robotics.
Robots learning from each other
Researchers at UC Berkeley developed RoVi-Aug, a framework that enables robots to transfer skills between models without human guidance:
The inspiration behind RoVi-Aug stems from advances in machine learning, especially in generative models that excel at generalization. UC Berkeley researchers aimed to replicate this adaptability in robotics, making it easier for robots to function in unpredictable settings. With RoVi-Aug, the dream of robots learning on the fly without human input is becoming a reality. It’s not just a technical win — it’s a glimpse into a future where robots truly work smarter, not harder.
Then there is collective learning through data sharing; new algorithms have been developed to enable these capabilities:
In fact, because data acquisition is often the bottleneck to effective and efficient learning, multi-robot systems that collaboratively leverage the data collected by many robots benefit from a multiplicative scaling on the rate at which they can learn.
Others are on similar paths, enabling robots to share experiences that speed up their learning; at Google, for instance:
However, robots can instantaneously transmit their experience to other robots over the network - sometimes known as "cloud robotics" - and it is this ability that can let them learn from each other.
There are several enablers of these robot-to-robot knowledge transfers:
Synthetic data, which reduces reliance on human-labeled datasets.
Decentralized algorithms enable robots to share insights without transmitting raw data, preserving privacy (sketched after this list).
Foundation models like GR00T N1 and similar architectures generalize skills across tasks and embodiments.
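The "insights without raw data" enabler is essentially federated learning: each robot improves its own copy of a model on private experience, and only the model parameters are averaged centrally. The sketch below shrinks the model to a single slope parameter purely for illustration; nothing in it reflects how GR00T or any specific system is actually trained.

```python
import random

# Hypothetical federated-averaging sketch: each robot fits a tiny local model
# (a single slope w in y = w * x) on its own private data, and only the learned
# parameters are shared and averaged; raw sensor data never leaves the robot.

TRUE_W = 2.0   # the underlying relationship every robot is trying to learn

def local_training(n_samples: int, rounds: int = 50, lr: float = 0.05,
                   w_init: float = 0.0) -> float:
    """Gradient descent on this robot's private (x, y) data."""
    data = [(x, TRUE_W * x + random.gauss(0, 0.1))
            for x in (random.uniform(-1, 1) for _ in range(n_samples))]
    w = w_init
    for _ in range(rounds):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

if __name__ == "__main__":
    global_w = 0.0
    for round_ in range(3):
        # Five robots train locally, each starting from the shared global model.
        local_ws = [local_training(n_samples=20, w_init=global_w) for _ in range(5)]
        global_w = sum(local_ws) / len(local_ws)   # only parameters are averaged
        print(f"round {round_ + 1}: shared model w = {global_w:.3f} (true {TRUE_W})")
```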
Cost
$100K for a humanoid robot might seem like a lot, but it really isn't if one considers all the costs of human employees. In fact, given the inherent advantages in working hours and rapidly increasing capabilities, a robot costing $100K would already be cheaper than most workers (provided it could be a perfect substitute).
They can work 4x the hours with perfect concentration and consistency, undistracted. They quickly become cheaper than humans, who also incur a host of additional costs (recruitment, training, supervision, administration, pension and healthcare contributions, canteens, parking spaces, etc.).
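A back-of-the-envelope comparison makes the point; apart from the $100K purchase price mentioned above, every figure below is an assumption chosen for the example, not a sourced number.

```python
# Illustrative back-of-the-envelope cost comparison; all numbers other than the
# $100K purchase price are assumptions made for this sketch, not sourced figures.

ROBOT_PRICE = 100_000          # USD, purchase price
ROBOT_LIFETIME_YEARS = 4       # assumed amortization period
ROBOT_RUNNING_COST = 10_000    # USD per year: power, maintenance, software
ROBOT_HOURS_PER_YEAR = 22 * 350          # ~22 h/day, allowing some downtime

HUMAN_ANNUAL_COST = 60_000     # USD: wage plus overhead (benefits, admin, space)
HUMAN_HOURS_PER_YEAR = 8 * 220           # ~8 h/day, ~220 working days

robot_cost_per_hour = (ROBOT_PRICE / ROBOT_LIFETIME_YEARS + ROBOT_RUNNING_COST) \
                      / ROBOT_HOURS_PER_YEAR
human_cost_per_hour = HUMAN_ANNUAL_COST / HUMAN_HOURS_PER_YEAR

print(f"robot: ${robot_cost_per_hour:.2f} per working hour")
print(f"human: ${human_cost_per_hour:.2f} per working hour")
```

Even with much less generous assumptions about uptime, lifetime, or running costs, the per-hour gap stays wide, which is the point being made here.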
We stress again that we already have AI systems that are better drivers than almost all humans, and many jobs are less complex than driving a car in busy traffic, even though humanoid robots are mechanically more complex than cars.
Given the rapid improvement, the area where robots will be more economical than humans will keep increasing:
Because we have already come so far with autonomous driving, humanoids will soon be working in factories, then in service and finally in our homes.
Limitations
That remains to be seen, as there are still some important limitations:
AI and Learning Limitations
AI algorithms for robots still struggle to adapt in unpredictable, unstructured environments and require vast training data that is costly to obtain and label.
Most models lack generalization: they excel at specific tasks but fail when faced with new challenges or environments where conditions change dynamically.
Dependence on supervised learning and human-labeled data slows real-time adaptation and learning; reinforcement learning and meta-learning approaches are promising but immature for robust deployment.
Current AI systems have issues with common sense, context understanding, and creative problem-solving outside narrow domains.
Hardware and Control
Precision in motion control, dexterity, and fine manipulation remain works in progress; tasks like grasping or interacting with human-scale objects are still challenging.
Sensor integration and fusion for real-time perception in cluttered or dynamic environments are technical hurdles, especially for fine navigation and manipulation.
Mechanical design challenges include power efficiency, robust actuation systems, and cost-effective production of reliable humanoid bodies.
Safety and Human Interaction
Safety protocols and autonomous error handling are essential; incidents can undermine public trust and commercial adoption.
Humanoid robots struggle to interpret non-verbal human cues, emotional context, and ambiguity, limiting their effectiveness in socially interactive or care roles.
Building explainable, transparent systems that earn public trust is necessary for real-world deployment.
Conclusion
Huge progress is being made on multiple fronts in the creation of humanoid robots that can autonomously perform a wide variety of tasks, improve performance, and learn new tasks.
But these limitations confine AI-driven robots, for now, to more defined tasks and situations. Even so, deployment in those settings triggers a cycle of mass adoption, bringing costs down and generating funds to finance further research into improvements.
One area of improvement is the emergence of foundation models, where Nvidia is already an important player, which we'll discuss in an upcoming article.