# Error estimation in molecular dynamics simulations: nitrogen impurities in iron with machine learning potentials

**18MODEV05** / Solid-state physics, Model and software development

Supervisor(s): **T. Verstraelen, S. Cottenier**

As electrons in matter obey the laws of quantum mechanics, one should in principle solve the Schrödinger equation when performing molecular simulations. On the nanoscale (tracking thousands of atoms during multiple nanoseconds) this is completely unfeasible. Even for approximate electronic structure methods, such as Density Functional Theory (DFT), the computational burden is too high to study phenomena such as diffusion of defects in solids. Empirical potentials can resolve this issue, because atomic forces are calculated with very simple analytical expressions. An example is the venerable embedded atom model (EAM), which was originally introduced to study hydrogen impurities in nickel [Daw1983] but was later extended to many other types of materials. The major disadvantage of empirical potentials is the loss of accuracy compared to electronic structure methods. Indeed, determining appropriate analytic expressions and corresponding parameters is a daunting task. This has led to a situation where force fields are often deemed unreliable, as one cannot anticipate to what extent a force field is able to correctly predict certain properties for the molecular systems at hand.

Recent breakthroughs in machine learning make it possible to overcome these reliability issues. With plenty of ab-initio data available, one can train a machine learning force field to accurately reproduce the potential energy surface. Furthermore, energy and force predictions for newly encountered atomic configurations can be computed relatively cheaply compared to electronic structure methods.

It is a challenge to obtain reliable force fields for iron-based metals, in particular for the development of new steels by adding nitrogen. In conventional steel processing, nitrogen alloying is difficult due to its limited solubility during casting and solidification. Alternatively, nitriding as a thermochemical treatment of the final material can be used to significantly improve surface and bulk properties. By studying the Fe-N metallurgy on the atomic scale, the aim is to gain a good understanding of the mechanisms of nitrogen diffusion and precipitation. To study the latter phenomenon, system sizes of thousands of atoms are required. Therefore, the development of an accurate empirical potential can provide a major leap forward in modeling the Fe-N system.

*Figure 1: (left) Fe16N2 is a tetragonally distorted iron structure, where the nitrogen atoms form a sublattice on the interstitial sites. (right) Precipitation requires the diffusion of atoms to a local region with a different crystal structure and/or composition. This region is typically composed of many thousands of atoms.*

In recent years, quantifying uncertainty in simulation predictions has attracted the attention of several scientific communities. The Technology Roadmap for Computational Chemistry (http://energy.gov/eere/amo/downloads/itp-chemicals-technology-roadmap-co...) ranked as most critical the “lack of methods to estimate the intrinsic accuracy of calculations, leading to a critical barrier to the more widespread use of these methods for solving practical engineering problems.” For example, in this thesis it would be desirable to estimate the error on the simulated nitrogen diffusion constant that results from the approximations in the force field.

Although the need for uncertainty management in constructing force-field models was expressed some years ago, the topic has received very little attention so far. If error estimation of force fields is mastered, new techniques like "on-the-fly learning" become available: based on the estimated error of the force field, one decides whether to apply the cheap force field or to fall back on the more accurate but computationally demanding ab-initio calculations.
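The decision rule behind on-the-fly learning can be illustrated with a minimal sketch. It assumes, purely for illustration, that the force-field error is estimated from the spread of an ensemble of models, and it uses a one-dimensional toy "reference" force as a stand-in for an ab-initio calculation. The names `predict_force_on_the_fly`, `reference_force`, `ensemble` and `tolerance` are hypothetical and do not come from any existing package.

```python
import statistics

def predict_force_on_the_fly(x, ensemble, reference, tolerance):
    """Return (force, source) for position x.

    The spread of the ensemble predictions serves as the error estimate
    (an assumption of this sketch). If it is within the tolerance, the
    cheap ensemble mean is used; otherwise the expensive reference is called.
    """
    predictions = [model(x) for model in ensemble]
    spread = statistics.pstdev(predictions)
    if spread <= tolerance:
        return statistics.mean(predictions), "force field"
    return reference(x), "ab initio"

# Toy stand-in for an ab-initio force: harmonic plus anharmonic term.
def reference_force(x):
    return -1.0 * x - 0.5 * x**3

# Toy force-field ensemble: harmonic models with slightly different stiffness.
ensemble = [lambda x, k=k: -k * x for k in (0.9, 1.0, 1.1)]

for x in (0.1, 2.0):
    force, source = predict_force_on_the_fly(x, ensemble, reference_force, tolerance=0.05)
    print(f"x = {x:.1f}: force = {force:.3f} (from {source})")
```

Close to equilibrium the ensemble members agree and the cheap prediction is accepted; far from it they disagree and the reference calculation is triggered. In a real workflow the new reference data would also be added to the training set.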

**Objectives**

The goal of this thesis is to construct a new force field with controlled accuracy for the Fe-N system, and to apply it to study the diffusion and precipitation of nitrogen in iron and the geometry of nitrogen precipitates (see Figure 1). In recent literature, machine learning methods such as kernel regression or Gaussian Approximation Potentials (GAP) have been successfully applied to metallic structures. Using the Smooth Overlap of Atomic Positions (SOAP) atomic descriptor [Bartók2013], which is translationally and rotationally invariant (so that linear and angular momentum are conserved), one can construct a new Fe-N force field if accurate ab-initio training data is available. The latter can be easily computed for various structures on the High Performance Computing (HPC) infrastructure of Ghent University. To validate the effectiveness and accuracy of the newly developed machine learning force field, one can compare it to an empirical EAM force field. For the calibration of parameters in these empirical models, ample reference data is available in the frame of ongoing research at the Center for Molecular Modeling. Good starting points can be found in the literature, e.g. an EAM parameterization for Fe-P [Ackland2004]. The calibration of the parameters will in first instance be carried out with established methods, such as PotFit (https://www.potfit.net/).

One major difficulty with the standard methods is that they try to find a single optimal parameter set to describe the training data, which may be misleading. A good performance on the training set does not imply reliable results in applications afterwards. Bayesian inference is ideally suited to estimate errors (for example, those due to overfitting) [Frederiksen2004]. In this thesis, an extension of the Bayesian approach will be used, which was recently developed at the Center for Molecular Modeling. Furthermore, this approach enables the inclusion of errors due to uncertainties in the reference data or approximations in the empirical potentials and machine learning methods being used.
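As an illustration of replacing a single optimal parameter set by an ensemble, the following minimal sketch samples parameters with a Metropolis random walk from a distribution proportional to exp(-C/T), where C is the least-squares cost. This is written in the spirit of the ensemble approach of [Frederiksen2004], not as a reproduction of it, and it is not the extended method developed at the Center for Molecular Modeling. The one-parameter model `y = k * x`, the toy data, the step size and the temperature choice are all assumptions made for this example.

```python
import math
import random
import statistics

def cost(k, data):
    """Least-squares cost of the toy model y = k * x."""
    return sum((y - k * x) ** 2 for x, y in data)

def sample_ensemble(data, k_start, temperature, n_steps, step_size, seed=0):
    """Metropolis sampling of k from p(k) proportional to exp(-cost(k) / temperature)."""
    rng = random.Random(seed)
    k, c = k_start, cost(k_start, data)
    samples = []
    for _ in range(n_steps):
        k_new = k + rng.gauss(0.0, step_size)
        c_new = cost(k_new, data)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if c_new <= c or rng.random() < math.exp(-(c_new - c) / temperature):
            k, c = k_new, c_new
        samples.append(k)
    return samples

# Toy reference data (hypothetical), roughly following y = 2 x.
data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

# Single best-fit parameter, as a standard fit would return it.
k_best = sum(x * y for x, y in data) / sum(x * x for x, _ in data)

# Temperature tied to the minimum cost and the number of parameters (one here);
# this scaling is an assumption of the sketch.
temperature = 2.0 * cost(k_best, data) / 1

samples = sample_ensemble(data, k_best, temperature, n_steps=20000, step_size=0.1)
ensemble = samples[2000:]  # discard the first part of the chain as burn-in
k_mean = statistics.mean(ensemble)
k_std = statistics.pstdev(ensemble)
print(f"best fit k = {k_best:.3f}, ensemble k = {k_mean:.3f} +/- {k_std:.3f}")
```

The width of the sampled distribution, rather than the best-fit value alone, is what carries the error estimate forward into any property computed with the model.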

The end result of the thesis would consist of relevant theoretical predictions with error estimates for nitrogen impurities in steel. For a fixed set of parameters in the force field potential, diffusion constants can be computed using the Einstein relation (mean squared displacement versus time). By repeating such simulations with different "reasonable" parameter vectors, one obtains a distribution of predictions, to which standard statistics can be applied. A more challenging question is the structure of nitrogen precipitates. These domains are only a few atomic layers thick but several micrometers long. The structure of these domains at their boundaries is unclear and can be investigated with molecular dynamics simulations as well.
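The diffusion-constant workflow described above can be sketched as follows. In three dimensions the Einstein relation reads MSD(t) = 6 D t, so D follows from the slope of the mean squared displacement versus time. In this sketch a Brownian random walk stands in for a molecular dynamics trajectory, and each "parameter vector" is reduced to a hypothetical true diffusion constant. The function names, the reduced units and all numerical values are assumptions made for illustration.

```python
import random
import statistics

def simulate_random_walk(d_true, dt, n_steps, n_particles, rng):
    """Return (times, msd) for independent 3D Brownian walkers.

    This is a toy stand-in for an MD trajectory: each Cartesian step is
    Gaussian with variance 2 * d_true * dt.
    """
    sigma = (2.0 * d_true * dt) ** 0.5
    positions = [[0.0, 0.0, 0.0] for _ in range(n_particles)]
    times, msd = [], []
    for step in range(1, n_steps + 1):
        total = 0.0
        for p in positions:
            for axis in range(3):
                p[axis] += rng.gauss(0.0, sigma)
            total += p[0] ** 2 + p[1] ** 2 + p[2] ** 2
        times.append(step * dt)
        msd.append(total / n_particles)
    return times, msd

def diffusion_constant(times, msd):
    """Einstein relation in 3D: least-squares slope of MSD(t) through the origin, divided by 6."""
    slope = sum(t * m for t, m in zip(times, msd)) / sum(t * t for t in times)
    return slope / 6.0

rng = random.Random(1)

# Each entry mimics one "reasonable" parameter vector of the force field
# (hypothetical values in reduced units).
estimates = []
for d_true in (0.9, 1.0, 1.1):
    times, msd = simulate_random_walk(d_true, dt=0.01, n_steps=200, n_particles=300, rng=rng)
    estimates.append(diffusion_constant(times, msd))

d_mean = statistics.mean(estimates)
d_std = statistics.stdev(estimates)
print(f"D = {d_mean:.3f} +/- {d_std:.3f} (reduced units)")
```

The spread reported here mixes the parameter uncertainty with the statistical noise of each finite simulation; in practice the latter has to be made small enough, or estimated separately, before the spread is interpreted as a force-field error.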

**Mobility**

Mobility is not required for this thesis. The topic is related to a collaboration between the Center for Molecular Modeling and OCAS, both of which are based in Ghent.

**Motivation Appl. Phys.**

This thesis requires physical insight because a good understanding of interatomic interactions is necessary to construct physically sensible force field models. The final results of the thesis are used in the development of new types of steel, which is clearly an engineering problem.

- Study programme: Master of Science in Engineering Physics [EMPHYS], Master of Science in Physics and Astronomy [CMFYST]
- Clusters: For Engineering Physics students, this thesis is closely related to the cluster(s) MODELING, MATERIALS, NANO
- Keywords: empirical potential, materials, error modeling, molecular simulation, machine learning

**References**

[Daw1983] M. S. Daw and M. I. Baskes, Phys. Rev. Lett. 50, 1285–1288 (1983). http://dx.doi.org/10.1103/PhysRevLett.50.1285

[Bartók2013] A. P. Bartók, R. Kondor and G. Csányi, "On representing chemical environments," Phys. Rev. B 87, 184115 (2013). https://doi.org/10.1103/PhysRevB.87.184115

[Ackland2004] G. J. Ackland, M. I. Mendelev, D. J. Srolovitz, S. Han and A. V. Barashev, "Development of an interatomic potential for phosphorus impurities in alpha-iron," J. Phys.: Condens. Matter 16, S2629–S2642 (2004). http://dx.doi.org/10.1088/0953-8984/16/27/003

[Frederiksen2004] S. L. Frederiksen, K. W. Jacobsen, K. S. Brown and J. P. Sethna, "Bayesian ensemble approach to error estimation of interatomic potentials," Phys. Rev. Lett. 93, 165501 (2004). http://dx.doi.org/10.1103/physrevlett.93.165501