RoSHI: A Versatile Robot-oriented Suit for Human Data In-the-Wild
arXiv:2604.07331v1
Abstract: Scaling up robot learning will likely require human data containing rich and long-horizon interactions in the wild. Existing approaches for collecting such data trade off portability, robustness to occlusion, and global consistency. We introduce RoSHI, a hybrid wearable that fuses low-cost sparse IMUs with the Project Aria glasses to estimate the full 3D pose and body shape of the wearer in a metric global coordinate frame from egocentric perception.
The system is motivated by the complementarity of the two sensors: IMUs provide robustness to occlusions and high-speed motions, while egocentric SLAM anchors long-horizon motion and stabilizes upper body pose. By fusing the two, RoSHI aims to improve the quality of human data collected in the wild for robot learning.
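To make the complementarity concrete, here is a minimal sketch of a complementary filter that blends dead-reckoned IMU motion with intermittent global position fixes. It is not RoSHI's fusion method, which the abstract does not specify; the function name `fuse_positions`, its parameters, and the blending weight `alpha` are all assumptions chosen for illustration.

```python
import numpy as np

def fuse_positions(imu_velocity, slam_position, dt, alpha=0.98):
    """Blend integrated IMU velocity with SLAM position fixes (illustrative only).

    imu_velocity : (T, 3) velocity estimates derived from an IMU, in m/s.
    slam_position: (T, 3) global positions from SLAM, in m; a row containing
                   NaN marks a frame with no SLAM fix.
    dt           : time step between frames, in seconds.
    alpha        : hypothetical weight on the IMU prediction; (1 - alpha)
                   pulls the estimate toward the SLAM fix.
    """
    imu_velocity = np.asarray(imu_velocity, dtype=float)
    slam_position = np.asarray(slam_position, dtype=float)
    fused = np.zeros_like(slam_position)
    # Start from the first SLAM fix, which is assumed to be valid.
    fused[0] = slam_position[0]
    for t in range(1, len(fused)):
        # Predict the next position by integrating IMU velocity over one step.
        predicted = fused[t - 1] + imu_velocity[t] * dt
        if np.isnan(slam_position[t]).any():
            # No SLAM fix for this frame: rely on the IMU prediction alone.
            fused[t] = predicted
        else:
            # SLAM fix available: pull the prediction toward the global anchor.
            fused[t] = alpha * predicted + (1.0 - alpha) * slam_position[t]
    return fused
```

In this toy setup, a constant velocity bias makes the IMU-only estimate drift without bound, while even a small weight on the SLAM fix keeps the fused estimate close to the anchor. That is the long-horizon anchoring role the abstract attributes to egocentric SLAM.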
Key Features of RoSHI
- Hybrid Wearable Technology: Combines low-cost sparse Inertial Measurement Units (IMUs) with Project Aria glasses.
- 3D Pose Estimation: Estimates the wearer's full 3D pose and body shape in a metric global coordinate frame from egocentric perception.
- Robustness: IMUs keep the estimate robust to occlusions and high-speed motions.
- Long-Horizon Motion Tracking: Egocentric SLAM anchors long-horizon motion and stabilizes upper body pose.
Dataset and Performance
To evaluate RoSHI, the authors collected a dataset of agile activities and used it to compare the system against existing baselines.
The results indicate that RoSHI generally outperforms other egocentric baselines and performs comparably to a leading exocentric baseline, SAM3D. The authors also report that the motion data recorded with RoSHI are suitable for real-world humanoid policy learning.
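The abstract does not say which metrics underlie these comparisons. As an illustration only, a metric commonly used in 3D pose estimation is the mean per-joint position error (MPJPE): the Euclidean distance between predicted and reference joint positions, averaged over joints and frames. The sketch below assumes joint positions are already expressed in the same coordinate frame and units; the function name `mpjpe` is a placeholder.

```python
import numpy as np

def mpjpe(predicted, ground_truth):
    """Mean per-joint position error, in the units of the inputs.

    Both arguments are arrays of shape (frames, joints, 3) holding 3D joint
    positions in a shared coordinate frame.
    """
    predicted = np.asarray(predicted, dtype=float)
    ground_truth = np.asarray(ground_truth, dtype=float)
    if predicted.shape != ground_truth.shape:
        raise ValueError("predicted and ground_truth must have the same shape")
    # Per-joint Euclidean distance, then the mean over all frames and joints.
    per_joint_error = np.linalg.norm(predicted - ground_truth, axis=-1)
    return float(per_joint_error.mean())
```

A lower value means the estimated skeleton lies closer to the reference. Comparing two systems on the same sequences with this kind of metric is one way a claim such as "outperforms" can be made quantitative.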
Applications and Future Work
Beyond data collection itself, RoSHI could change how robots learn from human interactions by supplying motion data captured in the wild. Potential application areas include:
- Human-Robot Interaction: Enhances the ability of robots to learn from natural human movements.
- Robotic Policy Learning: Facilitates the development of humanoid policies based on real-world data.
- Agile Robotics: Supports the training of robots for dynamic environments, improving their adaptability.
For videos, data, and more information about RoSHI, please visit the project webpage: https://roshi-mocap.github.io/.
