Large-scale deep learning molecular dynamics simulations with ab initio accuracy on supercomputer Fugaku

Lijun Liu; Zhouqiang Guo; Weile Jia

ICCM Conferences, The 14th International Conference of Computational Methods (ICCM2023)

Last modified: 2023-07-09

Abstract

Ab initio calculations and molecular dynamics (MD) are widely used to study physical phenomena and chemical reactions at the atomic level. Ab initio calculations derive atomic behavior from the electronic state, which makes them accurate for a wide range of substances but limits realistic models to a few hundred atoms. Conventional MD can treat much larger systems, but its accuracy is insufficient for heterogeneous internal structures and for inhomogeneous deformation around defects in materials.

In recent years, deep learning molecular dynamics, in which the interatomic potential is computed by a deep learning model, has attracted tremendous attention [1]. Because a neural network can fit the potential even when its functional form is unknown, the approach is expected to be highly versatile while maintaining ab initio accuracy. On supercomputers, deep neural network potentials trained on data from first-principles calculations can also extend the spatial and temporal scales of molecular dynamics. The current state of the art achieves 1 to 2 nanoseconds of molecular dynamics simulation per day for 100 million atoms on the Summit supercomputer with ab initio accuracy [2].

In this study, we significantly reduce the memory footprint and computational time through both algorithmic and system innovations. We compress the neural network through model tabulation, kernel fusion, and redundancy removal. We then accelerate the customized kernel and optimize the tabulation of the activation function. Hybrid MPI and OpenMP parallelization is implemented on GPU and ARM architectures. With these optimizations, the code scales to the entire Fugaku machine, and the system size can be extended to billions of atoms.
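
The abstract does not specify how the activation function is tabulated. As a rough illustration of the general idea only, the sketch below replaces calls to tanh with a lookup in a precomputed table and piecewise-linear interpolation between entries. The interpolation scheme, the range [-8, 8], the table size, and all function names are assumptions made for this example, not details of the authors' implementation.

```python
import math

def build_table(f, lo, hi, n):
    """Sample f at n + 1 evenly spaced points on [lo, hi] (hypothetical helper)."""
    step = (hi - lo) / n
    return [f(lo + i * step) for i in range(n + 1)], step

def lookup(table, step, lo, hi, x):
    """Piecewise-linear interpolation in the table; x is clamped to [lo, hi]."""
    x = min(max(x, lo), hi)
    t = (x - lo) / step
    i = min(int(t), len(table) - 2)  # index of the left grid point
    frac = t - i                     # position inside the interval, in [0, 1]
    return table[i] + frac * (table[i + 1] - table[i])

# Assumed settings: tanh is nearly saturated outside [-8, 8].
LO, HI, N = -8.0, 8.0, 4096
TABLE, STEP = build_table(math.tanh, LO, HI, N)

def tanh_tab(x):
    """Tabulated stand-in for math.tanh."""
    return lookup(TABLE, STEP, LO, HI, x)

# Worst-case deviation from math.tanh on a dense grid over the table range.
max_err = max(
    abs(tanh_tab(-8.0 + k * 0.001) - math.tanh(-8.0 + k * 0.001))
    for k in range(16001)
)
```

In a sketch like this, a larger table or a higher-order interpolant lowers `max_err` at the cost of memory, so accuracy is traded against footprint.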
