Making Room for AI: Multi-GPU Molecular Dynamics with Deep Potentials in GROMACS
In the rapidly evolving field of computational chemistry, GROMACS has established itself as a de facto standard for classical Molecular Dynamics (MD) simulations. However, the emergence of artificial intelligence (AI)-driven interatomic potentials, which strive for near-quantum accuracy while maintaining high MD throughput, presents a formidable challenge. The key question is how to effectively embed neural-network inference into multi-GPU simulations without sacrificing performance.
This article explores the integration of DeePMD-kit, a machine-learning interatomic potential (MLIP) framework, into GROMACS. The integration enables domain-decomposed, GPU-accelerated inference across multi-node systems, so that AI-driven potentials can be used in multi-GPU molecular dynamics simulations.
Key Developments in GROMACS-DeePMD Integration
- Extended NNPot Interface: The GROMACS NNPot interface has been extended with a DeePMD backend, so that Deep Potential models can be evaluated from within a GROMACS simulation.
- Domain Decomposition Layer: A new domain decomposition layer has been introduced for inference. It is decoupled from the main simulation, so the inference work is partitioned separately.
- Concurrent Inference Execution: Inference runs concurrently on all processes, with two MPI collectives per step: one to broadcast coordinates and one to aggregate forces. Every GPU therefore contributes to each force evaluation.
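The per-step pattern described in the last bullet can be illustrated with a small, single-process emulation. This is a minimal sketch under stated assumptions, not the GROMACS or DeePMD-kit implementation: the function names are hypothetical, a toy 1D harmonic pair force stands in for neural-network inference, a strided split of atom indices stands in for the spatial domain decomposition, and the force aggregation is modeled as a sum-reduction.

```python
# Single-process emulation of: broadcast coordinates -> per-rank inference -> aggregate forces.
# All names are hypothetical; nothing here is a GROMACS or DeePMD-kit API.

def toy_pair_force(xi, xj):
    """Stand-in for model inference: 1D harmonic force on atom i due to atom j."""
    return -(xi - xj)

def local_inference(coords, owned):
    """Forces contributed by one rank: computed for its owned atoms, zero elsewhere."""
    forces = [0.0] * len(coords)
    for i in owned:
        forces[i] = sum(
            toy_pair_force(coords[i], coords[j])
            for j in range(len(coords))
            if j != i
        )
    return forces

def md_step_forces(coords, n_ranks):
    """Emulate one MD step's force evaluation across n_ranks processes."""
    # Collective 1 (emulated broadcast): every rank receives the full coordinate set.
    per_rank_coords = [list(coords) for _ in range(n_ranks)]
    # Each rank runs "inference" on its own atoms (strided ownership is an assumption).
    partial = [
        local_inference(per_rank_coords[r], range(r, len(coords), n_ranks))
        for r in range(n_ranks)
    ]
    # Collective 2 (emulated sum-reduction): combine the per-rank force arrays.
    return [sum(column) for column in zip(*partial)]

if __name__ == "__main__":
    coords = [0.0, 1.0, 3.0]
    print(md_step_forces(coords, n_ranks=1))
    print(md_step_forces(coords, n_ranks=2))
```

The aggregated forces do not depend on the number of emulated ranks, which is the property the two collectives are meant to preserve: the decomposition changes who computes each force, not the result.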
Model Training and Validation
An in-house Deep Potential model (DPA-1) with 1.6 million parameters was trained on a dataset of solvated protein fragments. The implementation was first validated on a small protein system, which confirmed that the GROMACS-DeePMD integration works as intended.
Benchmarking Performance
The GROMACS-DeePMD integration was then benchmarked on a larger protein system containing 15,668 atoms. Tests ran on NVIDIA A100 and AMD MI250x GPUs, scaling up to 32 devices. The measured strong-scaling and weak-scaling efficiencies were:
- Strong-Scaling Efficiency: 66% at 16 devices and 40% at 32 devices.
- Weak-Scaling Efficiency: 80% at 16 devices; at 32 devices, 48% on MI250x and 40% on A100.
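To put these percentages in more concrete terms, the short sketch below converts them into implied speedups and slowdowns using the standard definitions of scaling efficiency. It assumes the efficiencies are quoted relative to a single-device baseline, which the text above does not state explicitly; the function names are illustrative.

```python
# Convert reported scaling efficiencies into implied speedups and slowdowns.
# Assumption: efficiencies are relative to a single-device baseline.

def strong_scaling_speedup(efficiency, n_devices):
    """Strong scaling: E = T_1 / (N * T_N), so speedup S = T_1 / T_N = E * N."""
    return efficiency * n_devices

def weak_scaling_slowdown(efficiency):
    """Weak scaling: E = T_1 / T_N, so the time per step grows by T_N / T_1 = 1 / E."""
    return 1.0 / efficiency

if __name__ == "__main__":
    # Strong-scaling figures quoted above.
    for eff, n in [(0.66, 16), (0.40, 32)]:
        print(f"{n} devices at {eff:.0%}: speedup of about {strong_scaling_speedup(eff, n):.2f}x")
    # Weak-scaling figures quoted above.
    for label, eff in [("16 devices", 0.80), ("32x MI250x", 0.48), ("32x A100", 0.40)]:
        print(f"{label} at {eff:.0%}: step time about {weak_scaling_slowdown(eff):.2f}x baseline")
```

Under that assumption, the strong-scaling figures correspond to speedups of roughly 10.6x on 16 devices and 12.8x on 32 devices, so doubling the device count from 16 to 32 adds only about 21% more throughput.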
Profiling Insights
Profiling with the ROCm System profiler showed that over 90% of the wall time was spent in DeePMD inference, while the MPI collectives accounted for a much smaller share. Inference itself, rather than communication, is therefore the dominant cost of each step and the main target for further optimization.
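The 90% figure also bounds what could be gained by optimizing everything other than inference. The sketch below applies Amdahl's law to that case; it is a back-of-the-envelope estimate that treats the inference share as exactly 0.90 (the profile reports "over 90%", so the true bound is lower still), and the function name is illustrative.

```python
# Amdahl-style bound: if inference takes a fraction f of the wall time and all
# remaining work (including the MPI collectives) became free, the run would
# speed up by at most 1 / f.

def max_speedup_if_rest_were_free(inference_fraction):
    """Upper bound on speedup from eliminating all non-inference time."""
    if not 0.0 < inference_fraction <= 1.0:
        raise ValueError("inference_fraction must be in (0, 1]")
    return 1.0 / inference_fraction

if __name__ == "__main__":
    bound = max_speedup_if_rest_were_free(0.90)
    print(f"At a 90% inference share, the bound is about {bound:.2f}x")
```

In other words, even a perfect communication layer would improve the step time by about 11% at most, which is why faster or cheaper inference matters far more than tuning the collectives.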
Conclusion
The integration of DeePMD-kit into GROMACS brings AI-driven interatomic potentials to multi-GPU molecular dynamics simulations. As the demand for accuracy in MD simulations continues to rise, this approach makes Deep Potential models usable within an established MD engine and opens new avenues for research in computational chemistry. Coupling GROMACS with AI frameworks such as DeePMD is likely to play a growing role in molecular simulation.
