Sim2Real-AD: A Modular Sim-to-Real Framework for Deploying VLM-Guided Reinforcement Learning in Real-World Autonomous Driving
arXiv:2604.03497v1 (cross-listed)
Abstract: Deploying reinforcement learning policies trained in simulation to real autonomous vehicles remains a fundamental challenge, particularly for VLM-guided RL frameworks whose policies are typically learned with simulator-native observations and simulator-coupled action semantics that are unavailable on physical platforms.
Introduction
Moving reinforcement learning (RL) policies from simulation to real vehicles remains a major hurdle in autonomous driving. Policies trained in simulation often fail on hardware because of discrepancies between simulated and real-world conditions. The problem is sharper for vision-language model (VLM) guided frameworks, whose policies rely on simulator-native observations and simulator-coupled action semantics that a physical platform does not provide.
Introducing Sim2Real-AD
This article introduces Sim2Real-AD, a modular framework for zero-shot sim-to-real transfer of VLM-guided RL policies trained in the CARLA simulator. The framework deploys these policies on full-scale vehicles without any real-world RL training data.
Framework Components
The Sim2Real-AD framework decomposes the transfer challenge into four essential components:
- Geometric Observation Bridge (GOB): Converts monocular front-view camera images into simulator-compatible bird’s-eye-view (BEV) observations, so that on the real vehicle the policy receives inputs in the format it was trained on.
- Physics-Aware Action Mapping (PAM): Translates policy outputs into platform-agnostic physical commands, so that actions learned against the simulator's control interface can be executed on a real vehicle.
- Two-Phase Progressive Training (TPT): Stabilizes adaptation by handling action-space transfer and observation-space transfer in separate phases rather than both at once.
- Real-time Deployment Pipeline (RDP): Integrates perception, policy inference, control conversion, and safety monitoring for closed-loop execution on the vehicle.
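As an illustration of what an action mapping in the spirit of PAM might involve, the following is a minimal Python sketch, not the authors' implementation. It assumes the policy emits normalized steering and acceleration values in [-1, 1] and that a vehicle can be described by three physical limits. `VehicleLimits`, `PhysicalCommand`, `map_action`, and all numeric values are hypothetical, chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VehicleLimits:
    """Hypothetical per-platform limits (illustrative, not from the paper)."""
    max_steer_rad: float    # largest steering angle, in radians
    max_accel_mps2: float   # strongest allowed acceleration, in m/s^2
    max_decel_mps2: float   # strongest allowed braking, as a positive m/s^2 value


@dataclass(frozen=True)
class PhysicalCommand:
    """A command in physical units, independent of any simulator API."""
    steer_rad: float
    accel_mps2: float


def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def map_action(steer_norm: float, accel_norm: float,
               limits: VehicleLimits) -> PhysicalCommand:
    """Map normalized policy outputs in [-1, 1] to physical units.

    Assumption: positive accel_norm means accelerate and negative means
    brake, each scaled by its own limit.
    """
    s = clamp(steer_norm, -1.0, 1.0)
    a = clamp(accel_norm, -1.0, 1.0)
    steer = s * limits.max_steer_rad
    if a >= 0.0:
        accel = a * limits.max_accel_mps2
    else:
        accel = a * limits.max_decel_mps2
    return PhysicalCommand(steer_rad=steer, accel_mps2=accel)


# Example with made-up limits for an unspecified vehicle.
limits = VehicleLimits(max_steer_rad=0.5, max_accel_mps2=2.0, max_decel_mps2=4.0)
cmd = map_action(steer_norm=0.5, accel_norm=-0.5, limits=limits)
print(cmd)
```

Expressing the command in radians and m/s^2 keeps the policy side unchanged across vehicles. In this sketch only the `VehicleLimits` values, and whatever low-level controller consumes `PhysicalCommand`, would differ between platforms.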
Experiments and Results
Extensive simulation experiments were conducted to evaluate the Sim2Real-AD framework. In simulation, the framework preserves the relative performance ordering of various RL algorithms across different reward paradigms. In real-world tests, zero-shot deployment on a full-scale Ford E-Transit achieved the following success rates:
- 90% success in car-following scenarios
- 80% success in obstacle avoidance
- 75% success in stop-sign interaction
Conclusion
According to the authors, this study is, to the best of their knowledge, one of the first successful demonstrations of zero-shot closed-loop deployment of a CARLA-trained VLM-guided RL policy on a full-scale real vehicle without any real-world RL training data. The results suggest that Sim2Real-AD can help bridge the gap between simulation and real-world autonomous driving.
A demo video and the code are available on the official Sim2Real-AD website.
