OccSim: Multi-kilometer Simulation with Long-horizon Occupancy World Models
Summary: arXiv:2603.28887v1
Data-driven autonomous driving simulation has relied heavily on pre-recorded driving logs or spatial priors such as high-definition (HD) maps. This dependency limits scalability and confines open-ended generation to the finite scale of existing datasets. To address this, the authors introduce OccSim, a 3D simulator driven by an occupancy world model.
OccSim requires neither continuous driving logs nor HD maps. Its only inputs are a single initial frame and a sequence of future ego-actions. From these, it stably generates over 3,000 continuous frames, which lets it build large-scale 3D occupancy maps spanning over 4 kilometers for simulation. This is more than an 80-fold improvement in stable generation length over previous state-of-the-art occupancy world models.
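The input contract described above (one initial frame plus a sequence of ego-actions) can be illustrated with a minimal autoregressive rollout sketch. This is not OccSim's implementation: the `rollout`, `compose`, and `path_length` names, the 2D `(x, y, yaw)` pose, the `(forward distance, yaw change)` action format, and the `predict_next` callback standing in for the world model are all assumptions made for illustration.

```python
import math
from typing import Any, Callable, List, Sequence, Tuple

Pose = Tuple[float, float, float]   # (x, y, yaw) in a fixed world frame (assumed 2D)
Action = Tuple[float, float]        # (forward distance in m, yaw change in rad), hypothetical format


def compose(pose: Pose, action: Action) -> Pose:
    """Apply one ego-action to a pose: turn first, then drive forward."""
    x, y, yaw = pose
    step, dyaw = action
    yaw += dyaw
    return (x + step * math.cos(yaw), y + step * math.sin(yaw), yaw)


def rollout(
    initial_frame: Any,
    actions: Sequence[Action],
    predict_next: Callable[[Any, Action], Any],
) -> Tuple[List[Any], List[Pose]]:
    """Generate one frame per action, feeding each output back in as input.

    No driving log or HD map is consulted: after the first frame, the
    model (here the hypothetical `predict_next`) sees only its own output.
    """
    frame, pose = initial_frame, (0.0, 0.0, 0.0)
    frames, poses = [frame], [pose]
    for action in actions:
        frame = predict_next(frame, action)
        pose = compose(pose, action)
        frames.append(frame)
        poses.append(pose)
    return frames, poses


def path_length(poses: Sequence[Pose]) -> float:
    """Total distance travelled along the ego trajectory, in metres."""
    return sum(math.dist(a[:2], b[:2]) for a, b in zip(poses, poses[1:]))
```

With a placeholder model such as `lambda frame, action: frame` and 3,000 straight-line actions of 1.4 m each (an illustrative step size, not a figure from the paper), `path_length` returns roughly 4,200 m, which is the order of magnitude the summary reports.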
Key Features of OccSim
OccSim’s capabilities are powered by two essential modules:
- W-DiT Based Static Occupancy World Model: This module is responsible for the ultra-long-horizon generation of static environments. It achieves this by explicitly incorporating known rigid transformations into the architecture design.
- Layout Generator: This component populates the dynamic foreground with reactive agents based on the synthesized road topology, ensuring that the generated environments are not only vast but also interactive.
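To make "known rigid transformations" concrete, here is a minimal sketch of how already-generated static occupancy could be re-expressed in the next ego frame once the ego motion is known. It does not show the W-DiT architecture itself; the sparse 2D voxel set, the `warp_to_next_ego_frame` name, the 0.5 m voxel size, and the 40 m grid half-extent are assumptions chosen for illustration.

```python
import math
from typing import Set, Tuple

Voxel = Tuple[int, int]  # (forward index, left index) in an ego-centred grid (assumed layout)


def warp_to_next_ego_frame(
    occupied: Set[Voxel],
    step: float,
    dyaw: float,
    voxel_size: float = 0.5,    # metres per voxel (assumed)
    half_extent: float = 40.0,  # grid covers [-40 m, 40 m] on each axis (assumed)
) -> Set[Voxel]:
    """Move static occupied voxels into the frame of the ego after one action.

    The ego turns by `dyaw` and then drives `step` metres, so a static
    point p in the old frame maps to R(-dyaw) @ (p - t), where t is the
    new ego position expressed in the old frame.
    """
    cos_t, sin_t = math.cos(dyaw), math.sin(dyaw)
    tx, ty = step * cos_t, step * sin_t
    warped: Set[Voxel] = set()
    for i, j in occupied:
        # Voxel centre in metres, relative to the new ego position.
        dx, dy = i * voxel_size - tx, j * voxel_size - ty
        # Rotate into the new ego heading (inverse rotation).
        x_new = cos_t * dx + sin_t * dy
        y_new = -sin_t * dx + cos_t * dy
        # Voxels that leave the grid are dropped.
        if abs(x_new) <= half_extent and abs(y_new) <= half_extent:
            warped.add((round(x_new / voxel_size), round(y_new / voxel_size)))
    return warped
```

Under this reading, the static scene only has to be generated for regions that newly enter the grid, because everything already seen is carried forward by geometry rather than re-predicted. Whether OccSim applies the transformation to voxels, to latent features, or elsewhere in the network is not stated in the summary.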
Together, these two modules let OccSim synthesize large and diverse simulation streams. Experiments demonstrate the downstream utility of data collected directly from OccSim: it can be used to pre-train 4D semantic occupancy forecasting models, reaching up to 67% zero-shot performance on unseen data and outperforming previous asset-based simulators by 11%.
Scaling and Performance Improvements
When the OccSim dataset is scaled to five times its original size, zero-shot performance rises to approximately 74%, and the improvement over asset-based simulators widens to 22.1%. These results suggest that OccSim-generated data becomes more useful for pre-training as its volume grows.
In conclusion, OccSim removes the dependence on driving logs and HD maps that has constrained data-driven autonomous driving simulation. By building on a long-horizon occupancy world model, it makes simulation more scalable and produces data that improves downstream 4D semantic occupancy forecasting. As research in this area continues, simulators of this kind may play an important role in the development of self-driving technology.
