AutoWorld: Scaling Multi-Agent Traffic Simulation with Self-Supervised World Models
arXiv:2603.28963v1
Abstract
Multi-agent traffic simulation is central to developing and testing autonomous driving systems. Recent data-driven simulators have achieved promising results, but rely heavily on supervised learning from labeled trajectories or semantic annotations, making it costly to scale their performance. Meanwhile, large amounts of unlabeled sensor data can be collected at scale but remain largely unused by existing traffic simulation frameworks. This raises a key question: How can a method harness unlabeled data to improve traffic simulation performance?
Introduction
Data-driven traffic simulators have traditionally depended on supervised learning, which requires labeled trajectories or semantic annotations and is therefore costly to scale. Unlabeled sensor data, by contrast, is abundant but goes largely unused. AutoWorld is a framework that addresses this gap by using unlabeled data to improve simulation performance.
AutoWorld Framework
AutoWorld is a traffic simulation framework built around a world model learned from unlabeled occupancy representations of LiDAR data. The framework has four key components:
- World Model: AutoWorld constructs a predictive scene context from samples of the learned world model. This context serves as the foundation for the subsequent motion generation process.
- Multi-Agent Motion Generation: The framework utilizes a multi-agent model to generate realistic motion trajectories based on the constructed scene context.
- Cascaded Determinantal Point Process: To enhance sample diversity, AutoWorld employs a cascaded Determinantal Point Process framework, guiding the sampling processes for both the world model and the motion model.
- Motion-Aware Latent Supervision: A training objective designed to improve how well the learned representation captures scene dynamics, so that the generated simulations reflect how the scene evolves over time rather than only its static layout.
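To make the diversity component more concrete, the following is a minimal sketch of greedy selection under a Determinantal Point Process (DPP). It is not AutoWorld's implementation: the source does not describe how the cascaded stages are coupled, so this shows only a single selection stage. The RBF similarity, the quality scores, the toy 2D features, and all function names are assumptions made for illustration.

```python
import math

def rbf_similarity(a, b, bandwidth=1.0):
    """Gaussian similarity between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def build_kernel(features, quality):
    """DPP L-kernel: L[i][j] = q_i * S(i, j) * q_j."""
    n = len(features)
    return [[quality[i] * rbf_similarity(features[i], features[j]) * quality[j]
             for j in range(n)] for i in range(n)]

def greedy_dpp_select(kernel, k, eps=1e-10):
    """Greedily pick up to k indices that approximately maximize det(L_S)."""
    n = len(kernel)
    gain = [kernel[i][i] for i in range(n)]  # marginal gain of adding each item
    chol = [[] for _ in range(n)]            # incremental Cholesky rows
    selected = []
    while len(selected) < k:
        remaining = [i for i in range(n) if i not in selected]
        if not remaining:
            break
        j = max(remaining, key=lambda i: gain[i])
        if gain[j] < eps:                    # nothing left adds meaningful volume
            break
        selected.append(j)
        root = math.sqrt(gain[j])
        for i in remaining:
            if i == j:
                continue
            # Project item i onto the newly selected item and shrink its gain.
            e = (kernel[j][i] - sum(a * b for a, b in zip(chol[j], chol[i]))) / root
            chol[i].append(e)
            gain[i] -= e * e
    return selected

# Hypothetical candidates: two near-duplicate pairs, e.g. trajectory endpoints.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
quality = [1.0, 0.9, 0.8, 0.7]               # assumed per-sample quality scores
picked = greedy_dpp_select(build_kernel(features, quality), k=2)
print(picked)
```

With these toy inputs the selection takes one candidate from each cluster instead of the two highest-quality samples, which are near-duplicates. A cascaded variant would presumably apply a similar selection at both the world-model and the motion-model sampling stages, but that coupling is not specified here.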
Experimental Results
Experiments on the Waymo Open Sim Agents Challenge (WOSAC) benchmark demonstrate the effectiveness of AutoWorld. The framework ranks first on the leaderboard according to the primary Realism Meta Metric (RMM). Key findings from the experiments include:
- The integration of unlabeled LiDAR data significantly enhances simulation performance.
- Ablation studies reveal the contribution of each component, confirming the importance of both the world model and the motion-aware supervision in improving realism.
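For readers unfamiliar with the metric, the RMM aggregates per-component realism scores into a single number as a weighted average. The sketch below illustrates only that aggregation step. The component names, scores, and weights are placeholder assumptions, not the official WOSAC configuration and not results from the paper.

```python
def realism_meta_metric(scores, weights):
    """Weighted average of component scores, each assumed to lie in [0, 1]."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[name] * scores[name] for name in weights) / total_weight

# Hypothetical component scores and weights, for illustration only.
example_scores = {"kinematic": 0.5, "interactive": 0.8, "map": 0.9}
example_weights = {"kinematic": 1.0, "interactive": 2.0, "map": 1.0}
example_rmm = realism_meta_metric(example_scores, example_weights)
print(round(example_rmm, 3))
```

Because the result is a weighted average, a gain in a heavily weighted component moves the overall score more than the same gain in a lightly weighted one, which is worth keeping in mind when reading ablation tables.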
Future Directions
AutoWorld suggests that traffic simulation realism can be improved without additional labeling. Because unlabeled sensor data can be collected at scale, the approach opens a path for further research on simulators for autonomous driving that draw on this data.
Conclusion
AutoWorld shows that unlabeled LiDAR data, used through a learned world model, can improve the realism of multi-agent traffic simulation while reducing dependence on costly labels. Its first-place RMM ranking on WOSAC and the ablation results support the contribution of both the world model and the motion-aware supervision.
