
Performance and Cost Assessment of Machine Learning Interatomic Potentials

| Metadata | Details |
| --- | --- |
| Publication Date | 2020-01-09 |
| Journal | The Journal of Physical Chemistry A |
| Authors | Yunxing Zuo, Chi Chen, Xiangguo Li, Zhi Deng, Yiming Chen |
| Institutions | Sandia National Laboratories, University of Cambridge |
| Citations | 839 |
| Analysis | Full AI Review Included |

Technical Documentation & Analysis: Machine Learning Interatomic Potentials


This research provides a systematic assessment of five Machine Learning Interatomic Potentials (ML-IAPs) against Density Functional Theory (DFT) data, offering insights for materials modeling, including diamond-structure Group IV semiconductors such as Si and Ge.

  • Superior Accuracy: All ML-IAPs (GAP, MTP, NNP, SNAP, qSNAP) achieved near-DFT accuracy in predicting energies, forces, and material properties, substantially outperforming traditional empirical potentials (EAM, MEAM, Tersoff).
  • Diamond Relevance: The data set includes the diamond-structure Group IV semiconductors Si and Ge. Carbon itself was not part of the study, but the same workflow offers a computational starting point for modeling CVD diamond (C).
  • Optimal Performance: The Moment Tensor Potential (MTP) and Gaussian Approximation Potential (GAP) models generally exhibited the lowest Mean Absolute Errors (MAEs) in energy (on the meV/atom scale) and forces (on the 0.1 eV/Å scale).
  • Cost-Accuracy Trade-off: MTP, NNP, SNAP, and qSNAP models were found to be about two orders of magnitude less computationally expensive than the GAP model for comparable accuracy. Because MTP also ranks among the most accurate models, it offers the best balance of cost and accuracy.
  • Extrapolability: The simple linear SNAP model showed surprisingly robust performance in extrapolating to unseen polymorphic structures (e.g., wurtzite Si/Ge), a key metric for novel material discovery.
  • Structural Integrity: All ML-IAPs accurately reproduced the Equation of State (EOS) curves for all elements, including the diamond-structure systems (Si, Ge), within the 2 meV/atom threshold for “indistinguishable EOS.”
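The energy and force MAEs quoted above can be computed along the following lines. This is a minimal sketch, not the authors' code: the function names and the toy numbers are hypothetical, and it assumes total energies in eV, per-structure atom counts, and forces given as per-atom Cartesian triples in eV/Å.

```python
def mean_absolute_error(predicted, reference):
    """Mean absolute error between two equal-length sequences of numbers."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def energy_mae_mev_per_atom(pred_energies_ev, ref_energies_ev, n_atoms):
    """Energy MAE in meV/atom from total energies (eV) and atom counts."""
    pred = [1000.0 * e / n for e, n in zip(pred_energies_ev, n_atoms)]
    ref = [1000.0 * e / n for e, n in zip(ref_energies_ev, n_atoms)]
    return mean_absolute_error(pred, ref)


def force_mae_ev_per_angstrom(pred_forces, ref_forces):
    """Force MAE in eV/Å over every Cartesian component of every atom."""
    flat_pred = [c for atom in pred_forces for c in atom]
    flat_ref = [c for atom in ref_forces for c in atom]
    return mean_absolute_error(flat_pred, flat_ref)


# Hypothetical toy data (not taken from the paper): two structures.
ref_energies = [-10.0, -20.0]      # DFT total energies, eV
pred_energies = [-10.004, -19.992] # ML-IAP total energies, eV
atom_counts = [2, 4]

ref_forces = [[0.0, 0.1, -0.2], [0.3, 0.0, 0.0]]   # eV/Å
pred_forces = [[0.1, 0.1, -0.1], [0.2, 0.0, 0.1]]  # eV/Å

energy_mae = energy_mae_mev_per_atom(pred_energies, ref_energies, atom_counts)
force_mae = force_mae_ev_per_angstrom(pred_forces, ref_forces)
print(f"energy MAE: {energy_mae:.3f} meV/atom")
print(f"force MAE:  {force_mae:.4f} eV/Å")
```

Normalizing energies per atom before averaging keeps structures of different sizes comparable, which matters when the data set mixes bulk cells, slabs, and AIMD snapshots.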

The following table summarizes key quantitative data extracted from the ML-IAP optimization and validation process, focusing on parameters relevant to Group IV semiconductors (Si and Ge).

| Parameter | Value | Unit | Context |
| --- | --- | --- | --- |
| DFT Kinetic Energy Cutoff | 520 | eV | VASP calculation parameter |
| DFT Force Convergence | 0.02 | eV/Å | Atomic force component convergence threshold |
| Si Lattice Parameter (DFT) | 5.469 | Å | Ground state crystal structure |
| Ge Lattice Parameter (DFT) | 5.763 | Å | Ground state crystal structure |
| Si Bulk Modulus (BVRH, DFT) | 95 | GPa | Voigt-Reuss-Hill approximation |
| Ge Bulk Modulus (BVRH, DFT) | 71 | GPa | Voigt-Reuss-Hill approximation |
| Si Vacancy Formation Energy (DFT) | 3.25 | eV | Critical input for defect modeling |
| Ge Vacancy Formation Energy (DFT) | 2.19 | eV | Critical input for defect modeling |
| Si Optimized Cutoff Radius (GAP) | 5.4 | Å | Maximum range of interatomic interactions |
| Ge Optimized Cutoff Radius (NNP) | 5.6 | Å | Maximum range of interatomic interactions |
| EOS Accuracy Threshold ($\Delta_{EOS}$) | < 2 | meV/atom | Threshold for “indistinguishable EOS” from DFT |
| Computational Cost Scaling | Linear | s/(MD step $\cdot$ atom) | Observed scaling for all ML-IAPs with atom count |
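The bulk moduli in the table are Voigt-Reuss-Hill (VRH) averages. For a cubic crystal such as diamond-structure Si or Ge, these follow from the three independent elastic constants C11, C12, and C44 using the standard textbook formulas. The sketch below illustrates that arithmetic. The function name is hypothetical, and the input constants are illustrative placeholders rather than values from the paper.

```python
def cubic_vrh_moduli(c11, c12, c44):
    """Voigt-Reuss-Hill bulk and shear moduli of a cubic crystal (inputs in GPa)."""
    # For cubic symmetry the Voigt and Reuss bulk moduli coincide.
    bulk = (c11 + 2.0 * c12) / 3.0
    # Voigt (uniform strain) and Reuss (uniform stress) shear bounds.
    shear_voigt = (c11 - c12 + 3.0 * c44) / 5.0
    shear_reuss = 5.0 * (c11 - c12) * c44 / (4.0 * c44 + 3.0 * (c11 - c12))
    # Hill average: arithmetic mean of the two bounds.
    shear_hill = 0.5 * (shear_voigt + shear_reuss)
    return {
        "bulk_vrh": bulk,
        "shear_voigt": shear_voigt,
        "shear_reuss": shear_reuss,
        "shear_vrh": shear_hill,
    }


# Illustrative placeholder constants in GPa (assumed, not quoted from the paper).
moduli = cubic_vrh_moduli(c11=156.0, c12=65.0, c44=76.0)
print(f"B_VRH = {moduli['bulk_vrh']:.1f} GPa")  # ≈ 95.3 GPa for these inputs
print(f"G_VRH = {moduli['shear_vrh']:.1f} GPa")
```

Because the bulk modulus of a cubic crystal depends only on C11 and C12, an error in a potential's C44 shows up in the shear modulus but not in the bulk modulus.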

The ML-IAPs were trained and validated using a rigorous, multi-step computational workflow designed to ensure diversity and consistency across the data set.

  1. DFT Data Generation: High-throughput Density Functional Theory (DFT) calculations were performed using the Vienna ab initio simulation package (VASP 5.4.1) with the Perdew-Burke-Ernzerhof (PBE) generalized gradient approximation (GGA).
  2. Diverse Structure Sampling: The training data included five categories of atomic configurations for six elements (Li, Mo, Ni, Cu, Si, Ge):
    • Ground-state crystals (bcc, fcc, diamond).
    • Strained structures (±10% strain at 2% intervals).
    • Slab structures (up to Miller index 3, e.g., (100), (110), (111)).
    • NVT ab initio Molecular Dynamics (AIMD) simulations at various temperatures (from 300 K up to 2.0× the melting point).
    • AIMD simulations including single vacancies.
  3. ML-IAP Training: Five ML-IAPs built on four local environment descriptors were evaluated: Gaussian Approximation Potential (GAP), Moment Tensor Potential (MTP), High-dimensional Neural Network Potential (NNP), and the Spectral Neighbor Analysis Potential in its linear (SNAP) and quadratic (qSNAP) forms.
  4. Optimization Scheme: A two-loop optimization procedure was employed:
    • Inner Loop: Trained the ML model using a 90:10 training/test split against DFT energies, forces, and stresses.
    • Outer Loop: Optimized hyperparameters (e.g., cutoff radius, degrees of freedom) by minimizing the error in predicted bulk material properties (e.g., elastic tensors).
  5. Validation: Performance was assessed based on Mean Absolute Errors (MAEs) in energy and forces, accuracy in predicting material properties (lattice parameters, elastic constants, migration/vacancy energies), and the deviation of the Equation of State (EOS) curves from DFT reference data.
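The two-loop optimization scheme in step 4 can be illustrated with a deliberately simple stand-in. The sketch below is not the authors' implementation. It assumes a one-parameter model `y = w * x` fitted by ridge regression in place of a real ML-IAP, a ridge strength standing in for hyperparameters such as the cutoff radius, and a single reference value standing in for a bulk material property. All names and data are hypothetical.

```python
import random


def split_train_test(samples, train_fraction=0.9, seed=0):
    """Shuffle the samples and split them 90:10 into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def inner_loop_fit(train, ridge):
    """Inner loop: fit y = w * x by ridge-regularized least squares."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + ridge)


def mae(weight, data):
    """Mean absolute error of the fitted model on (x, y) pairs."""
    return sum(abs(weight * x - y) for x, y in data) / len(data)


def outer_loop_search(samples, ridge_grid, property_x, property_ref):
    """Outer loop: choose the hyperparameter with the smallest property error."""
    train, test = split_train_test(samples)
    best = None
    for ridge in ridge_grid:
        weight = inner_loop_fit(train, ridge)
        # Stand-in for the error in a predicted bulk property.
        property_error = abs(weight * property_x - property_ref)
        if best is None or property_error < best["property_error"]:
            best = {
                "ridge": ridge,
                "weight": weight,
                "property_error": property_error,
                "test_mae": mae(weight, test),
            }
    return best


# Hypothetical toy data following y = 2x exactly.
samples = [(float(i), 2.0 * i) for i in range(1, 21)]
best = outer_loop_search(samples, ridge_grid=[0.0, 1.0, 10.0],
                         property_x=1.0, property_ref=2.0)
print(best)
```

The point of the split is that the inner loop only ever sees energies, forces, and stresses, while the outer loop judges each candidate by a derived property, so hyperparameters are selected on the quantities a user ultimately cares about.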

The high-accuracy computational modeling presented in this paper is essential for advancing materials science, particularly for high-performance materials like diamond. 6CCVD provides the physical, high-quality MPCVD diamond substrates necessary to validate and realize the applications derived from these advanced ML-IAPs.

| Research Requirement/Finding | 6CCVD Solution & Value Proposition |
| --- | --- |
| Focus on Diamond-Structure Group IV Systems (Si, Ge) | Applicable Materials: Single Crystal Diamond (SCD) & Polycrystalline Diamond (PCD). Diamond (C) is the ultimate Group IV semiconductor. We supply ultra-pure MPCVD diamond substrates, essential for experimental validation of ML-IAPs targeting extreme thermal, mechanical, and electronic properties. |
| Validation of Elastic Constants (C11, C12, C44) | Precision SCD Wafers: ML-IAPs predict elastic constants with high fidelity. 6CCVD provides SCD wafers with precise crystallographic orientations ((100), (110), (111)) and superior surface quality (Ra < 1 nm polishing) required for high-precision mechanical testing. |
| Modeling of Defect Chemistry (Vacancy/Migration Energy) | High-Purity SCD Substrates: Accurate simulation of defect formation (Ev) and migration (Em) requires materials with minimal background impurities. Our SCD material ensures experimental results align with theoretical predictions by controlling nitrogen and other defect concentrations. |
| Large-Scale MD Simulation Validation | Large-Area PCD Plates (Up to 125 mm): The study confirms the linear scaling of ML-IAPs for large systems (up to 11,664 atoms). For large-scale device integration or thermal management applications, 6CCVD offers PCD plates up to 125 mm in diameter and up to 500 µm thick. |
| Custom Surface and Interface Modeling (Slab Structures) | Advanced Fabrication & Metallization: The DFT data included complex slab structures. 6CCVD offers precision laser cutting and custom metallization services (Au, Pt, Pd, Ti, W, Cu) to create specific surface terminations and device interfaces for validating ML-IAP predictions on complex structures. |
| Need for Optimized Potentials for Electronic Applications | Boron-Doped Diamond (BDD): For researchers extending ML-IAPs to doped systems (e.g., BDD for electrochemistry or power electronics), 6CCVD provides custom BDD films with controlled doping concentrations and thicknesses (0.1 µm to 500 µm). |
| Engineering Support & Consultation | In-House PhD Team: 6CCVD’s expert material scientists can assist researchers in selecting the optimal diamond material specifications (thickness, doping, orientation) required for similar atomistic simulation and potential development projects. |

For custom specifications or material consultation, visit 6ccvd.com or contact our engineering team directly.

Original Abstract

Machine learning of the quantitative relationship between local environment descriptors and the potential energy surface of a system of atoms has emerged as a new frontier in the development of interatomic potentials (IAPs). Here, we present a comprehensive evaluation of machine learning IAPs (ML-IAPs) based on four local environment descriptors, namely atom-centered symmetry functions (ACSF), smooth overlap of atomic positions (SOAP), the spectral neighbor analysis potential (SNAP) bispectrum components, and moment tensors, using a diverse data set generated using high-throughput density functional theory (DFT) calculations. The data set comprising bcc (Li, Mo) and fcc (Cu, Ni) metals and diamond group IV semiconductors (Si, Ge) is chosen to span a range of crystal structures and bonding. All descriptors studied show excellent performance in predicting energies and forces far surpassing that of classical IAPs, as well as predicting properties such as elastic constants and phonon dispersion curves. We observe a general trade-off between accuracy and the degrees of freedom of each model and, consequently, computational cost. We will discuss these trade-offs in the context of model selection for molecular dynamics and other applications.