EquiformerV3: Scaling Efficient, Expressive, and General SE(3)-Equivariant Graph Attention Transformers
As SE(3)-equivariant graph neural networks mature as a core tool for 3D atomistic modeling, improving their efficiency, expressivity, and physical consistency has become a central challenge for large-scale applications. In this work, we introduce EquiformerV3, the third generation of the SE(3)-equivariant graph attention Transformer, designed to advance all three dimensions: efficiency, expressivity, and generality.
Building on EquiformerV2, we make three key advances:
- Optimized Software Implementation: We optimize the software implementation of EquiformerV3, achieving a 1.75× speedup over its predecessor. The lower computational overhead makes the model more practical for large-scale applications.
- Modifications to EquiformerV2: We introduce simple yet effective modifications that enhance the performance of EquiformerV2. These include:
  - Equivariant merged layer normalization
  - Improved feedforward network hyper-parameters
  - Attention with smooth radius cutoff
- SwiGLU-$S^2$ Activations: To incorporate many-body interactions for better theoretical expressivity, we propose SwiGLU-$S^2$ activations. They preserve strict equivariance while reducing the complexity of sampling $S^2$ grids. Together with smooth-cutoff attention, these activations enable accurate modeling of smoothly varying potential energy surfaces (PES).
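
To make the idea of attention with a smooth radius cutoff concrete, here is a minimal sketch of one way such a scheme could look. It is not taken from the EquiformerV3 code: the cosine envelope, the envelope-weighted softmax, and the names `cosine_envelope` and `smooth_cutoff_attention` are all assumptions chosen for illustration. The point it shows is that a neighbor sitting at the cutoff radius receives zero weight, so neighbors entering or leaving the cutoff sphere do not cause jumps in the other weights.

```python
import math


def cosine_envelope(r, r_cut):
    """Hypothetical smooth cutoff: 1 at r = 0, 0 for r >= r_cut, zero slope at r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)


def smooth_cutoff_attention(logits, distances, r_cut):
    """Envelope-weighted softmax over the neighbors of one atom (illustrative only).

    w_j = env(r_j) * exp(l_j) / sum_k env(r_k) * exp(l_k)
    """
    if not logits:
        return []
    m = max(logits)  # subtract the max logit for numerical stability
    scores = [
        cosine_envelope(r, r_cut) * math.exp(l - m)
        for l, r in zip(logits, distances)
    ]
    total = sum(scores)
    if total == 0.0:
        # Every neighbor is at or beyond the cutoff: nothing to attend to.
        return [0.0 for _ in scores]
    return [s / total for s in scores]


# Two neighbors well inside a 5.0 cutoff (units are arbitrary here).
base = smooth_cutoff_attention([0.3, -1.2], [1.0, 2.0], 5.0)

# A third neighbor exactly at the cutoff gets zero weight and leaves the others unchanged.
with_edge = smooth_cutoff_attention([0.3, -1.2, 2.0], [1.0, 2.0, 5.0], 5.0)
```

With a hard cutoff and a plain softmax, the third neighbor would take a finite share of the attention the moment it crossed the radius, which produces a discontinuity in the model output and therefore in the predicted energy.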
These advances generalize EquiformerV3 to tasks requiring energy-conserving simulations and higher-order derivatives of the PES. With denoising non-equilibrium structures (DeNS) as an auxiliary task during training, EquiformerV3 achieves state-of-the-art results on several benchmark datasets, including OC20, OMat24, and Matbench Discovery.
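
For readers less familiar with the terminology, "energy-conserving" is read here in the usual sense for interatomic potentials: forces are obtained as the negative gradient of a single predicted energy rather than from a separate force output. The symbols below ($E$ for the predicted energy, $\mathbf{r}_i$ for the position of atom $i$) are introduced for illustration and are not notation from the paper.

$$
\mathbf{F}_i = -\frac{\partial E}{\partial \mathbf{r}_i},
\qquad
\mathbf{H}_{ij} = \frac{\partial^2 E}{\partial \mathbf{r}_i \, \partial \mathbf{r}_j}
$$

Forces defined this way form a conservative field, and the Hessian $\mathbf{H}_{ij}$ is one example of a higher-order derivative of the PES. Both are only well behaved if $E$ varies smoothly with the atomic positions, which is why the smooth radius cutoff and the smoothness of the activations matter for these tasks.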
Taken together, EquiformerV3 addresses the efficiency, expressivity, and generality of SE(3)-equivariant graph neural networks for 3D atomistic modeling. The combination of a faster implementation, targeted architectural modifications, and SwiGLU-$S^2$ activations gives researchers in materials science and computational chemistry a practical tool for modeling complex molecular interactions and energy landscapes in large-scale atomistic simulations.
