Learning Human-to-Humanoid
Real-Time Whole-Body Teleoperation

Abstract

We present Human to Humanoid (H2O), a reinforcement learning (RL) based framework that enables real-time whole-body teleoperation of a full-sized humanoid robot with only an RGB camera. To create a large-scale retargeted motion dataset of human movements for humanoid robots, we propose a scalable "sim-to-data" process to filter and select feasible motions using a privileged motion imitator. Afterwards, we train a robust real-time humanoid motion imitator in simulation using these refined motions and transfer it to the real humanoid robot in a zero-shot manner. We successfully achieve teleoperation of dynamic whole-body motions in real-world scenarios, such as walking, back jumping, kicking, turning, waving, pushing, and boxing. To the best of our knowledge, this is the first demonstration to achieve learning-based real-time whole-body humanoid teleoperation.

Left-right / Right-left Ball Kicking

Box Handover

Walking Forward and Jumping Back

Boxing

Step Forward and Punch

Walking with a Stroller / Walking and Rotating

Robustness Test

Human Motion Retargeting

Method

H2O framework

  1. Retargeting: In the first stage, the process aligns the SMPL body model to a humanoid's structure by optimizing shape and motion parameters. The second stage refines this by removing artifacts and infeasible motions using a trained privileged imitation policy, yielding a realistic and cleaned motion dataset for the humanoid.
  2. Sim-to-Real Training: An imitation policy is trained to track motion goals sampled from the cleaned retargeted dataset.
  3. Real-time Teleoperation Deployment: An RGB camera and a pose estimator capture human motion, which the humanoid robot then mimics using the trained sim-to-real imitation policy.
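
The filtering idea in the retargeting step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `MotionClip` type, the `rollout` callable standing in for the privileged imitation policy, the error metric, and the threshold value are all hypothetical and chosen only for illustration. The sketch keeps a retargeted clip only if a simulated rollout tracks it to within a tolerance and does not terminate early.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Assumption: a frame is a flat list of keypoint coordinates.
Frame = Sequence[float]


@dataclass
class MotionClip:
    """Hypothetical container for one retargeted motion sequence."""
    name: str
    frames: List[Frame]


def max_tracking_error(reference: List[Frame], achieved: List[Frame]) -> float:
    """Return the largest per-frame mean absolute deviation."""
    worst = 0.0
    for ref, got in zip(reference, achieved):
        err = sum(abs(r - g) for r, g in zip(ref, got)) / len(ref)
        worst = max(worst, err)
    return worst


def filter_feasible(
    clips: List[MotionClip],
    rollout: Callable[[MotionClip], List[Frame]],
    threshold: float = 0.5,  # assumed tolerance, not a value from the source
) -> List[MotionClip]:
    """Keep clips that the (stand-in) privileged imitator can track."""
    kept = []
    for clip in clips:
        achieved = rollout(clip)
        # A rollout shorter than the reference is treated as a failure,
        # for example because the simulated robot fell.
        if len(achieved) < len(clip.frames):
            continue
        if max_tracking_error(clip.frames, achieved) <= threshold:
            kept.append(clip)
    return kept


def toy_rollout(clip: MotionClip) -> List[Frame]:
    """Toy stand-in for a simulated policy: clamps values to [-1, 1]."""
    return [[max(-1.0, min(1.0, x)) for x in frame] for frame in clip.frames]


clips = [
    MotionClip("wave", [[0.1, 0.2], [0.3, -0.4]]),
    MotionClip("flip", [[3.0, 0.0], [0.0, 0.0]]),  # exceeds the toy limits
]
kept = filter_feasible(clips, toy_rollout)
print([c.name for c in kept])  # prints ['wave']
```

In this toy setup the "flip" clip is dropped because the clamped rollout deviates from the reference by more than the assumed threshold, while "wave" is tracked exactly and kept. A real pipeline would replace `toy_rollout` with physics simulation of the trained privileged policy.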

BibTeX

@inproceedings{he2024learning,
  author    = {He, Tairan and Luo, Zhengyi and Xiao, Wenli and Zhang, Chong and Kitani, Kris and Liu, Changliu and Shi, Guanya},
  title     = {Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation},
  booktitle = {arXiv},
  year      = {2024},
}