T. Verstraelen

The Tale of HORTON: Lessons Learned in a Decade of Scientific Software Development

M. Chan, T. Verstraelen, A. Tehrani, M. Richer, X. D. Yang, T. D. Kim, E. Vohringer-Martinez, F. Heidar-Zadeh, P. W. Ayers

Journal of Chemical Physics

Volume 160, Issue 16

2024

A1

Abstract

HORTON is a free and open-source electronic-structure package written primarily in Python 3 with some underlying C++ components. While HORTON’s development has been mainly directed by the research interests of its leading contributing groups, it is designed to be easily modified, extended, and used by other developers of quantum chemistry methods or post-processing techniques. Most importantly, HORTON adheres to modern principles of software development, including modularity, readability, flexibility, comprehensive documentation, automatic testing, version control, and quality-assurance protocols. This article explains how the principles and structure of HORTON have evolved since we started developing it more than a decade ago. We review the features and functionality of the latest HORTON release (version 2.3) and discuss how HORTON is evolving to support electronic structure theory research for the next decade.

Keywords: quantum chemistry software, computational chemistry, Hartree-Fock method, model Hamiltonians, Density Functional Theory (DFT) methods, numerical integration grids, periodic boundary conditions, Gaussian integrals, atoms-in-molecules partitioning schemes, Hirshfeld partitioning, population analysis, electrostatic potential fitting, parsing and converting computational chemistry file formats, theoretical chemistry Python library

DOI

http://dx.doi.org/10.1063/5.0196638

Managing Expectations and Imbalanced Training Data in Reactive Force Field Development: An Application to Water Adsorption on Alumina

L. Dumortier, C. Chizallet, B. Creton, T. De Bruin, T. Verstraelen

Journal of Chemical Theory and Computation (JCTC)

2024

A1

Abstract

ReaxFF is a computationally efficient model for reactive molecular dynamics simulations that has been applied to a wide variety of chemical systems. When ReaxFF parameters are not yet available for a chemistry of interest, they must be (re)optimized, for which one defines a set of training data that the new ReaxFF parameters should reproduce. ReaxFF training sets typically contain diverse properties with different units, some of which are more abundant (by orders of magnitude) than others. To find the best parameters, one conventionally minimizes a weighted sum of squared errors over all of the data in the training set. One of the challenges in such numerical optimizations is to assign weights so that the optimized parameters represent a good compromise among all the requirements defined in the training set. This work introduces a new loss function, called Balanced Loss, and a workflow that replaces weight assignment with a more manageable procedure. The training data are divided into categories with corresponding “tolerances”, i.e., acceptable root-mean-square errors for the categories, which define the expectations for the optimized ReaxFF parameters. Through the Log-Sum-Exp form of Balanced Loss, the parameter optimization is also a validation of one’s expectations, providing meaningful feedback that can be used to reconfigure the tolerances if needed. The new methodology is demonstrated with a nontrivial parametrization of ReaxFF for water adsorption on alumina. This results in a new force field that reproduces both the rare and frequent properties of a validation set not used for training. We also demonstrate the robustness of the new force field with a molecular dynamics simulation of water desorption from a γ-Al₂O₃ slab model.

DOI

http://dx.doi.org/10.1021/acs.jctc.3c01009

Simple Molecular Model for Hydrated Silicate Ionic Liquids, a Realistic Zeolite Precursor

The HSIL liquid structure strongly depends on the charge density and concentration of the alkali cation, ranging from a glassy network with fast ion exchange to an aqueous solution containing long-lasting solvated ion pairs.

J. Vekeman, D. Vandenabeele, N. Doppelhammer, E. Vandeurzen, E. Breynaert, C.E.A. Kirschhock, T. Verstraelen

Chemistry of Materials

36, 8, 3886-3897

2024

A1

Abstract

Despite the widespread use of zeolites in the chemical industry, their formation process is not fully understood due to the complex and heterogeneous structure of traditional synthesis media. Hydrated silicate ionic liquids (HSILs) have been proposed as an alternative. They are truly homogeneous and transparent mixtures with a low viscosity, facilitating experimental characterization. Interestingly, their homogeneous nature and simple speciation bring realistic molecular models of a zeolite growth liquid within reach for the first time. In this work, a simple molecular model is developed that gives insight into the crucial role of the alkali cations (sodium, potassium, rubidium, and cesium). Thereby, molecular dynamics simulations are combined with experimental measurements to demonstrate that the HSIL liquid structure strongly depends on the charge density and concentration of the alkali cation. As the water content increases, it transitions from a glassy network with fast ion exchange to an aqueous solution containing long-lasting solvated ion pairs. Furthermore, simulations reveal that the cation is capable of bringing several silicate monomers together in a glassy network, displaying perfect orientations for condensation reactions that underlie zeolite formation. This work is an important step toward the development of molecular models that can fully describe the early nucleation process of zeolites in combination with experiments.

Open Access version available at UGent repository

DOI

http://dx.doi.org/10.1021/acs.chemmater.4c00285

The Gradient Curves Method: An improved strategy for the derivation of molecular mechanics valence force fields from ab initio data

T. Verstraelen, D. Van Neck, P.W. Ayers, V. Van Speybroeck, M. Waroquier

LECTURE SERIES ON COMPUTER AND COMPUTATIONAL SCIENCES

Volume 7A-B, page 576 -+

2006

P1

Abstract

A novel force-field parameterization procedure[1] is proposed that surmounts well-known difficulties of the conventional least squares parameterization. The multidimensional ab initio training data are first transformed into individual one-dimensional data sets, each associated with one term in the force-field model. In the second step conventional methods can be used to fit each energy term separately to its corresponding data set. The first step can be completed without any knowledge of the analytical expressions for the energy terms. Moreover the transformed data sets dictate the form of these expressions, which makes the method very suitable for deriving valence force fields. During the transformation in the first step, continuity and least-norm criteria are imposed. The latter facilitate the intuitive physical interpretation of the energy terms that are fitted to the transformed data sets, a prerequisite for transferable force fields. Benchmark parameterizations have been performed on three small molecules, showing that the new method results in physically intuitive energy terms, exactly when a conventional parameterization would suffer from parameter correlations, i.e. when the number of redundant internal coordinates in the force-field model increases.

The significance of fluctuating charges for molecular polarizability and dispersion coefficients

Y. Cheng, T. Verstraelen

The Journal of Chemical Physics

Volume 159, Issue 9

2023

A1

Abstract

The influence of fluctuating charges or charge flow on the dynamic linear response properties of isolated molecules from the TS42 database is evaluated, with particular emphasis on dipole polarizability and C6 dispersion coefficients. Two new descriptors are defined to quantify the charge-flow contribution to response properties, making use of the recoupled dipole polarizability to separate isotropic and anisotropic components. Molecular polarizabilities are calculated using the “frequency-dependent atom-condensed Kohn-Sham density functional theory approximated to second order”, i.e. the ACKS2ω model. With ACKS2ω, the charge-flow contribution can be constructed in two conceptually distinct ways, which appear to yield compatible results. The charge-flow contribution is significantly affected by molecular geometry and the presence of polarizable bonds, in line with previous studies. We show that the charge-flow contribution qualitatively reproduces the polarizability anisotropy. The contribution to the anisotropic C6 coefficients is less pronounced, but cannot be neglected. The effect of fluctuating charges is only negligible for small molecules with at most one non-hydrogen atom. They become important and sometimes dominant for larger molecules or when highly polarizable bonds are present, such as conjugated, double or triple bonds. Charge flow contributions cannot be explained in terms of individual atomic properties, because they are affected by non-local features such as chemical bonding and geometry. Therefore, polarizable force fields and dispersion models can benefit from the explicit modeling of charge flow.

DOI

http://dx.doi.org/10.1063/5.0163842

DFT-Quality Adsorption Simulations in Metal–Organic Frameworks Enabled by Machine Learning Potentials

R. Goeminne, L. Vanduyfhuys, V. Van Speybroeck, T. Verstraelen

Journal of Chemical Theory and Computation (JCTC)

19, 18, 6313-6325

2023

A1

Abstract

Nanoporous materials such as metal–organic frameworks (MOFs) have been extensively studied for their potential for adsorption and separation applications. In this respect, grand canonical Monte Carlo (GCMC) simulations have become a well-established tool for computational screenings of the adsorption properties of large sets of MOFs. However, their reliance on empirical force field potentials has limited the accuracy with which this tool can be applied to MOFs with challenging chemical environments such as open-metal sites. On the other hand, density-functional theory (DFT) is too computationally demanding to be routinely employed in GCMC simulations due to the excessive number of required function evaluations. Therefore, we propose in this paper a protocol for training machine learning potentials (MLPs) on a limited set of DFT intermolecular interaction energies (and forces) of CO₂ in ZIF-8 and the open-metal site containing Mg-MOF-74, and use the MLPs to derive adsorption isotherms from first principles. We make use of the equivariant NequIP model which has demonstrated excellent data efficiency, and as such an error on the interaction energies below 0.2 kJ mol⁻¹ per adsorbate in ZIF-8 was attained. Its use in GCMC simulations results in highly accurate adsorption isotherms and heats of adsorption. For Mg-MOF-74, a large dependence of the obtained results on the used dispersion correction was observed, where PBE-MBD performs the best. Lastly, to test the transferability of the MLP trained on ZIF-8, it was applied to ZIF-3, ZIF-4, and ZIF-6, which resulted in large deviations in the predicted adsorption isotherms and heats of adsorption. Only when explicitly training on data for all ZIFs, accurate adsorption properties were obtained. As the proposed methodology is widely applicable to guest adsorption in nanoporous materials, it opens up the possibility for training general-purpose MLPs to perform highly accurate investigations of guest adsorption.

DOI

http://dx.doi.org/10.1021/acs.jctc.3c00495

Modeling Electronic Response Properties with an Explicit-Electron Machine Learning Potential

M. Cools-Ceuppens, J. Dambre, T. Verstraelen

Journal of Chemical Theory and Computation

18, 3, 1672-1691

2022

A1

Abstract

Explicit-electron force fields introduce electrons or electron pairs as semiclassical particles in force fields or empirical potentials, which are suitable for molecular dynamics simulations. Even though semiclassical electrons are a drastic simplification compared to a quantum-mechanical electronic wave function, they still retain a relatively detailed electronic model compared to conventional polarizable and reactive force fields. The ability of explicit-electron models to describe chemical reactions and electronic response properties has already been demonstrated, yet the description of short-range interactions for a broad range of chemical systems remains challenging. In this work, we present the electron machine learning potential (eMLP), a new explicit-electron force field in which the short-range interactions are modeled with machine learning. The electron pair particles will be located at well-defined positions, derived from localized molecular orbitals or Wannier centers, naturally imposing the correct dielectric and piezoelectric behavior of the system. The eMLP is benchmarked on two newly constructed data sets: eQM7, an extension of the QM7 data set for small molecules, and a data set for the crystalline beta-glycine. It is shown that the eMLP can predict dipole moments, polarizabilities, and IR spectra of unseen molecules with high precision. Furthermore, a variety of response properties, for example, stiffness or piezoelectric constants, can be accurately reproduced.

Green Open Access

DOI

http://dx.doi.org/10.1021/acs.jctc.1c00978

An information-theoretic approach to basis-set fitting of electron densities and other non-negative functions

A. Tehrani, J. S. M. Anderson, D. Chakraborty, J. I. Rodriguez-Hernandez, D. C. Thompson, T. Verstraelen, P. W. Ayers, F. Heidar-Zadeh

Journal of Computational Chemistry

2023

A1

Abstract

The numerical ill-conditioning associated with approximating an electron density with a convex sum of Gaussian or Slater-type functions is overcome by using the (extended) Kullback–Leibler divergence to measure the deviation between the target and approximate density. The optimized densities are non-negative and normalized, and they are accurate enough to be used in applications related to molecular similarity, the topology of the electron density, and numerical molecular integration. This robust, efficient, and general approach can be used to fit any non-negative normalized functions (e.g., the kinetic energy density and molecular electron density) to a convex sum of non-negative basis functions. We present a fixed-point iteration method for optimizing the Kullback–Leibler divergence and compare it to conventional gradient-based optimization methods. These algorithms are released through the free and open-source BFit package, which also includes an L2-norm squared optimization routine applicable to any square-integrable scalar function.

Green Open Access

DOI

http://dx.doi.org/10.1002/jcc.27170

A Reactive Molecular Dynamics Study of Chlorinated Organic Compounds. Part II: A ChemTraYzer Study of Chlorinated Dibenzofuran Formation and Decomposition Processes

L. Krep, F. Schmalz, F. Solbach, L. Komissarov, T. Nevolianis, W. A. Kopp, T. Verstraelen, K. Leonhard

ChemPhysChem

24, 7

2023

A1

Abstract

In our two-paper series, we first present the development of ReaxFF CHOCl parameters using the recently published ParAMS parametrization tool. In this second part, we update the reactive Molecular Dynamics - Quantum Mechanics coupling scheme ChemTraYzer and combine it with our new ReaxFF parameters from Part I to study formation and decomposition processes of chlorinated dibenzofurans. We introduce a self-learning method for recovering failed transition-state searches that improves the overall ChemTraYzer transition-state search success rate by 10 percentage points to a total of 48 %. With ChemTraYzer, we automatically find and quantify more than 500 reactions using transition state theory and DFT. Among the discovered chlorinated dibenzofuran reactions are numerous reactions that are new to the literature. In three case studies, we discuss the set of reactions that are most relevant to the dibenzofuran literature: (i) bimolecular reactions of the chlorinated-dibenzofuran precursors phenoxy radical and 1,3,5-trichlorobenzene, (ii) dibenzofuran chlorination and pyrolysis, and (iii) oxidation of chlorinated dibenzofurans.

DOI

http://dx.doi.org/10.1002/cphc.202200783
