Joint effort by OpenDriveLab at The University of Hong Kong, Huawei Inc. and Shanghai Innovation Institute (SII).
- Highlights
- News
- Benchmark
- System Architecture
- Roadmap
- Getting Started
- Citation
- Contributing
- License
- Related Resources

WorldEngine is a post-training framework for Physical AI that systematically addresses the long-tail safety-critical data scarcity problem in autonomous driving.
- Data-driven long-tail discovery: Failure-prone scenarios are automatically identified from real-world driving logs by the pre-trained agent itself — no manual design, no synthetic perturbations.
- Photorealistic interactive simulation via 3D Gaussian Splatting (3DGS): Each discovered scenario is reconstructed into a fully controllable, real-time-renderable simulation environment with independent dynamic agent manipulation.
- Behavior-driven scenario generation: Leverages Behavior World Model (BWM) to generalize and synthesize diverse traffic variations from existing long-tail scenarios, expanding sparse safety-critical events into a dense, learnable distribution.
- RL-based post-training on synthesized safety-critical rollouts substantially outperforms scaling pre-training data alone — competitive with a ~10× increase in pre-training data.
- Production-scale validation: Deployed on a mass-produced ADAS platform trained on 80,000+ hours of real-world driving logs, reducing simulated collision rate by up to 45.5% and achieving zero disengagements in a 200 km on-road test.
- [2026/04/09] Official dataset released. See OpenDriveLab/WorldEngine (Hugging Face) or OpenDriveLab/WorldEngine (ModelScope).
- [2026/04/10] Official code repository established.
We compare different post-training paradigms on the nuPlan dataset, evaluating on both open-loop and closed-loop metrics across common and rare driving scenarios.
Note: These results are from an early-stage release. Stable checkpoints and the corresponding results are coming soon.

Metric notes:
- Open-loop PDMS is aligned with the NAVSIM v1.1 PDM Score. Common denotes the standard `navtest` split; Rare denotes the `navtest_failures` subset, i.e. failure-prone rare-case scenarios extracted from `navtest`.
- Closed-loop Success Rate is defined as the fraction of simulated driving episodes completed without collision or off-road failure.
- Closed-loop PDMS* is the PDM Score obtained via SimEngine closed-loop testing, where the planner interacts with reactive agents in simulation under real-time rendering.
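
The two metric families above can be sketched in a few lines of Python. This is a minimal illustration, not SimEngine's implementation: the `EpisodeResult` record and its field names are hypothetical, and the PDM Score aggregation follows our understanding of the NAVSIM v1 definition (no-collision and drivable-area terms as multiplicative penalties, and a 5:5:2 weighted average of ego progress, time-to-collision, and comfort).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class EpisodeResult:
    """Outcome flags for one simulated episode (hypothetical record layout)."""
    collided: bool
    off_road: bool


def success_rate(episodes: list[EpisodeResult]) -> float:
    """Fraction of episodes finished with neither a collision nor an off-road failure."""
    if not episodes:
        raise ValueError("need at least one episode")
    succeeded = sum(1 for e in episodes if not (e.collided or e.off_road))
    return succeeded / len(episodes)


def pdm_score(nc: float, dac: float, ep: float, ttc: float, comfort: float) -> float:
    """Aggregate sub-scores in [0, 1] into a single PDM Score in [0, 1].

    nc and dac act as multiplicative penalties; ep, ttc, and comfort are
    combined as a weighted average with weights 5, 5, and 2.
    """
    weighted = (5.0 * ep + 5.0 * ttc + 2.0 * comfort) / 12.0
    return nc * dac * weighted
```

The benchmark table reports both quantities on a 0 to 100 scale, so multiply the returned values by 100 before comparing.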
Training notes:
- Rare logs are failure-prone scenarios automatically extracted from `navtrain` by the pre-trained agent itself (see Rare Case Extraction).
- Common logs are the standard cases in `navtrain`.
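
As a rough illustration of what "extracted by the pre-trained agent itself" can mean, the sketch below keeps the scenarios on which the agent scores poorly. The selection rule, the `threshold` value, and the token-to-score mapping are all assumptions made for this example; the actual criteria are described in the Rare Case Extraction guide.

```python
from __future__ import annotations


def extract_rare_cases(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return scenario tokens whose score is below `threshold`, worst first.

    `scores` maps a scenario token to the pre-trained agent's score on that
    scenario, assumed here to lie in [0, 1]. The 0.5 default is illustrative.
    """
    failures = [(score, token) for token, score in scores.items() if score < threshold]
    return [token for _, token in sorted(failures)]
```

For example, `extract_rare_cases({"a": 0.9, "b": 0.1, "c": 0.45})` selects `b` and `c` and leaves out `a`.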
| Method | Open-loop PDMS ↑ (common) | Open-loop PDMS ↑ (rare) | Closed-loop Success Rate ↑ | Closed-loop PDMS* ↑ |
|---|---|---|---|---|
| Base model | 85.62 | 47.15 | 73.61 | 60.28 |
| Supervised fine-tuning on rare logs | 87.03 | 49.68 | 73.26 | 62.26 |
| Post-training on common logs | 86.15 | 51.49 | 64.58 | 56.66 |
| Post-training on rare logs | 89.29 | 62.56 | 74.31 | 62.55 |
| Post-training on rare synthetic replays | 88.01 | 56.62 | 76.39 | 62.11 |
| Post-training on rare rollouts w/o Behavior WM | 88.99 | 59.69 | 85.07 | 68.29 |
| Post-training with WorldEngine | 88.95 | 59.83 | 88.89 | 70.12 |
Key findings:
- Post-training on rare logs significantly outperforms supervised fine-tuning (62.56 vs 49.68 open-loop rare PDMS), demonstrating the advantage of reward-guided optimization over imitation.
- Post-training on common logs provides limited benefit and even degrades closed-loop performance (success rate drops from 73.61% to 64.58%), confirming that long-tail event discovery is essential.
- The full WorldEngine pipeline achieves the best closed-loop performance (88.89% success rate, 70.12 PDMS*), an absolute improvement of 15.28 percentage points in success rate over the base model.
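
The differences quoted in these findings can be recomputed directly from the benchmark table. The snippet below copies the relevant rows and is only a convenience for checking the arithmetic; the dictionary layout and function name are ours.

```python
# (open-loop rare PDMS, closed-loop success rate in %) copied from the benchmark table.
ROWS = {
    "Base model": (47.15, 73.61),
    "Supervised fine-tuning on rare logs": (49.68, 73.26),
    "Post-training on common logs": (51.49, 64.58),
    "Post-training on rare logs": (62.56, 74.31),
    "Post-training with WorldEngine": (59.83, 88.89),
}


def success_rate_delta(method: str, baseline: str = "Base model") -> float:
    """Absolute change in closed-loop success rate, in percentage points."""
    return round(ROWS[method][1] - ROWS[baseline][1], 2)


def rare_pdms_delta(method: str, baseline: str) -> float:
    """Absolute change in open-loop PDMS on the rare split."""
    return round(ROWS[method][0] - ROWS[baseline][0], 2)


if __name__ == "__main__":
    print(success_rate_delta("Post-training with WorldEngine"))
    print(success_rate_delta("Post-training on common logs"))
    print(rare_pdms_delta("Post-training on rare logs", "Supervised fine-tuning on rare logs"))
```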
Each pair shows the Base model vs WorldEngine post-trained model on the same rare-case scenario. Left: front-camera rendering; Right: BEV trajectory visualization.
Zero disengagements in 200 km on-road testing on a mass-produced ADAS platform.
WorldEngine consists of two tightly coupled subsystems:
| Module | Function | Core Technology |
|---|---|---|
| SimEngine | Closed-loop simulation with ego & agents | Hydra, Ray, rendering |
| AlgEngine | End-to-end model training & evaluation | MMDetection3D, UniAD/VADv2/HydraMDP |
- Core platform integration (SimEngine + AlgEngine)
- Multi-GPU distributed simulation and training
- Rare case extraction and fine-tuning pipeline
- Comprehensive documentation and usage guides
- Hugging Face / ModelScope dataset
- Open-source release (code, data, early pre-trained models)
- arXiv preprint
- Behavior World Model integration
- Stable pre-trained models
WorldEngine provides comprehensive guides for each stage of your workflow:
| Guide | Purpose | Key Topics |
|---|---|---|
| Installation | Set up both conda environments | Two-environment setup (simengine + algengine), dependencies, troubleshooting |
| Data Organization | Prepare datasets and checkpoints | Data structure, Hugging Face/ModelScope downloads, symlinks |
| Quick Start | Run your first experiment in 5 min | Quick test tutorial, understanding results, complete pipeline |
| SimEngine Usage | Master closed-loop simulation | Rollout scripts, distributed testing, configuration, metrics |
| AlgEngine Usage | Train and fine-tune models | Training from scratch, evaluation, rare case extraction, RL fine-tuning |
WorldEngine requires two separate conda environments due to different Python requirements.
Full installation guide: docs/installation.md
Verify your installation with a pre-trained model:
```bash
# Set up environment variable
export WORLDENGINE_ROOT=$(pwd)
# Option 1: Single GPU test
bash scripts/closed_loop_test.sh
# Option 2: Multi-GPU test (Default 8 GPUs)
bash scripts/multigpu_closed_loop_test.sh
```

What this does:
- Loads a pre-trained VADv2 model (50% training data, epoch 8)
- Runs closed-loop simulation on 288 rare-case test scenarios
- Evaluates with the NAVSIM v1 PDMS (collision avoidance, progress, comfort, etc.)
- Saves results to `experiments/closed_loop_exps/e2e_vadv2_50pct/navtest_failures_NR/`
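
Before launching the quick test, it can help to confirm that `WORLDENGINE_ROOT` points at the repository root. The helper below is a small pre-flight sketch and not part of WorldEngine: the function name is hypothetical, and it only checks for the two scripts named above.

```python
from __future__ import annotations

import os
from pathlib import Path

# Script paths taken from the quick-test commands, relative to the repository root.
QUICK_TEST_SCRIPTS = (
    "scripts/closed_loop_test.sh",
    "scripts/multigpu_closed_loop_test.sh",
)


def missing_quick_test_files(root: str | None = None) -> list[str]:
    """Return the quick-test scripts that are absent under `root`.

    When `root` is not given, the WORLDENGINE_ROOT environment variable is used.
    """
    if root is None:
        root = os.environ.get("WORLDENGINE_ROOT")
    if not root:
        raise RuntimeError("WORLDENGINE_ROOT is not set; run `export WORLDENGINE_ROOT=$(pwd)` first")
    base = Path(root)
    return [rel for rel in QUICK_TEST_SCRIPTS if not (base / rel).is_file()]
```

An empty list means both scripts were found; any returned path suggests the variable was exported from the wrong directory.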
Detailed quick start tutorial: docs/quick_start.md
After the quick test, explore each subsystem in detail:
Learn how to run simulations, generate rollouts, and test models:
- Rollout scripts for data generation (no model required)
- Testing scripts for model evaluation (single/multi-GPU)
- Ray distributed simulation for large-scale testing
- Reactive vs non-reactive agent modes
- Configuration guide for all Hydra parameters
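
To give a feel for what distributed testing involves, the sketch below splits a scenario list evenly across workers in plain Python. It is not SimEngine's scheduler and does not use Ray; the round-robin strategy and the scenario names are assumptions for illustration. The numbers match the quick test: 288 rare-case scenarios over the default 8 GPUs gives 36 scenarios per worker.

```python
from __future__ import annotations


def shard_round_robin(items: list[str], num_workers: int) -> list[list[str]]:
    """Split `items` into `num_workers` shards whose sizes differ by at most one."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return [items[i::num_workers] for i in range(num_workers)]


if __name__ == "__main__":
    # Placeholder scenario names; real scenario tokens come from the dataset.
    scenarios = [f"scenario_{i:03d}" for i in range(288)]
    shards = shard_round_robin(scenarios, num_workers=8)
    print([len(shard) for shard in shards])
```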
Learn how to train models, extract rare cases, and fine-tune:
- Training from scratch
- Open-loop evaluation on test sets
- Rare case extraction from evaluation failures
- RL-based fine-tuning on long-tail scenarios
- Multi-GPU training with distributed data parallel
WorldEngine's simulation environments are powered by 3D Gaussian Splatting (MTGS):
- Multi-traversal reconstruction from nuPlan data
- Photorealistic rendering for closed-loop simulation
- Asset generation for SimEngine scenes
If any parts of our work help your research, please consider citing us and giving a star to our repository:
If you use the Render Assets (MTGS), please also cite:
```bibtex
@article{li2025mtgs,
title={MTGS: Multi-Traversal Gaussian Splatting},
author={Li, Tianyu and Qiu, Yihang and Wu, Zhenhua and Lindstr{\"o}m, Carl and Su, Peng and Nie{\ss}ner, Matthias and Li, Hongyang},
journal={arXiv preprint arXiv:2503.12552},
year={2025}
}
```

If you use the augmented scenarios data, please cite as well:
```bibtex
@inproceedings{zhou2025nexus,
title={Decoupled Diffusion Sparks Adaptive Scene Generation},
author={Zhou, Yunsong and Ye, Naisheng and Ljungbergh, William and Li, Tianyu and Yang, Jiazhi and Yang, Zetong and Zhu, Hongzi and Petersson, Christoffer and Li, Hongyang},
booktitle={ICCV},
year={2025}
}

@article{li2025optimization,
title={Optimization-Guided Diffusion for Interactive Scene Generation},
author={Li, Shihao and Ye, Naisheng and Li, Tianyu and Chitta, Kashyap and An, Tuo and Su, Peng and Wang, Boyang and Liu, Haiou and Lv, Chen and Li, Hongyang},
journal={arXiv preprint arXiv:2512.07661},
year={2025}
}
```

If you find AlgEngine useful, please also cite:
```bibtex
@ARTICLE{11353028,
author={Liu, Haochen and Li, Tianyu and Yang, Haohan and Chen, Li and Wang, Caojun and Guo, Ke and Tian, Haochen and Li, Hongchen and Li, Hongyang and Lv, Chen},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Reinforced Refinement With Self-Aware Expansion for End-to-End Autonomous Driving},
year={2026},
volume={48},
number={5},
pages={5774-5792},
keywords={Adaptation models;Self-aware;Autonomous vehicles;Pipelines;Planning;Training;Reinforcement learning;Uncertainty;Data models;Safety;End-to-end autonomous driving;reinforced finetuning;imitation learning;motion planning},
doi={10.1109/TPAMI.2026.3653866}
}
```

If you find the data-scaling information helpful, please also cite:
```bibtex
@article{tian2025simscale,
title={SimScale: Learning to Drive via Real-World Simulation at Scale},
author={Haochen Tian and Tianyu Li and Haochen Liu and Jiazhi Yang and Yihang Qiu and Guang Li and Junli Wang and Yinfeng Gao and Zhang Zhang and Liang Wang and Hangjun Ye and Tieniu Tan and Long Chen and Hongyang Li},
journal={arXiv preprint arXiv:2511.23369},
year={2025}
}
```

We welcome contributions from the community! Whether you want to:
- Report bugs - Open an Issue
- Improve documentation - Submit a Pull Request
- Contribute code - Fork, develop, and submit a PR
Please read our contributing guidelines before submitting PRs.
For questions:
- Check the documentation first
- Search existing Issues
All content in this repository is under the Apache-2.0 license.
The released data is based on nuPlan and is under the CC-BY-NC-SA 4.0 license.
We acknowledge all the open-source contributors for the following projects to make this work possible:
If you find WorldEngine useful, please consider giving us a star!
Quick Links: Documentation | Installation | Quick Start | Issues | Discussions
Contact: For research collaboration or questions, visit our Discussions








