Uncertainty estimation by committee models for molecular dynamics and thermodynamic averages

10 November 2020

G. Imbalzano

Abstract

Machine learning models have emerged as a very effective strategy to sidestep time-consuming electronic-structure calculations and obtain an estimate of the energy and properties of an atomistic system, enabling simulations of greater size, time scale and complexity without sacrificing the accuracy of first-principles calculations. Given the interpolative nature of these models, the accuracy of predictions depends on the position in phase space, and it is crucial to obtain an estimate of the error that derives from the finite number of reference structures included during the model training. Committee models, which combine information from multiple training exercises (performed, for instance, using different reference data), use the spread of their predictions to obtain reliable estimates of the uncertainty of a single-point calculation, such as a lattice energy. Such uncertainty quantification is particularly useful in the context of molecular dynamics simulations. Here we discuss how it can be used, together with a baseline energy model or a more robust although less accurate interatomic potential, to obtain more resilient simulations and to support active-learning strategies. Furthermore, we introduce an on-the-fly reweighting scheme that makes it possible to estimate the uncertainty in the thermodynamic averages extracted from long trajectories, incorporating both the error in the single-point predictions of the properties and the error due to the distortion of the sampling probability. We present examples covering different types of structural and thermodynamic properties, and systems as diverse as water and liquid gallium.
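
As an illustration of the two ideas summarized above, the following minimal sketch shows (i) a committee estimate, where the mean over members is the prediction and their spread is the uncertainty, and (ii) a per-member reweighted average over a trajectory. It is not the authors' implementation: the function names, the array layout, and the assumption that the trajectory was sampled with the committee-mean potential, so that member i carries weights proportional to exp(-beta (V_i - V_mean)), are illustrative choices that the abstract does not specify.

```python
import numpy as np


def committee_prediction(preds):
    """Mean and spread of committee predictions.

    preds: array of shape (n_models, n_frames), one row per committee member.
    Returns (mean, sigma), each of shape (n_frames,). The sample standard
    deviation (ddof=1) is used as the uncertainty estimate.
    """
    preds = np.asarray(preds, dtype=float)
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)


def reweighted_averages(energies, observable, beta):
    """Per-member reweighted averages of an observable.

    Assumption (not from the source): the frames were sampled with the
    committee-mean potential, so member i is reweighted with
    w_i proportional to exp(-beta * (V_i - V_mean)).

    energies:   (n_models, n_frames) member energies for each frame.
    observable: (n_frames,) or (n_models, n_frames) property values.
    beta:       inverse temperature 1 / (k_B T), in units matching energies.
    Returns an array of shape (n_models,).
    """
    energies = np.asarray(energies, dtype=float)
    obs = np.broadcast_to(np.asarray(observable, dtype=float), energies.shape)
    log_w = -beta * (energies - energies.mean(axis=0))
    # Shift by the row maximum for numerical stability; the shift cancels
    # in the ratio below.
    log_w = log_w - log_w.max(axis=1, keepdims=True)
    w = np.exp(log_w)
    return (w * obs).sum(axis=1) / w.sum(axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic data: 4 members, 1000 frames, small member-to-member noise.
    base = rng.normal(size=1000)
    energies = base + 0.05 * rng.normal(size=(4, 1000))
    mean, sigma = committee_prediction(energies)
    per_member = reweighted_averages(energies, observable=base, beta=1.0)
    # The spread of the per-member averages gives an uncertainty estimate
    # for the thermodynamic average.
    print(per_member.mean(), per_member.std(ddof=1))
```

In this sketch the spread of the per-member averages reflects only the distortion of the sampling probability when a single observable array is passed; passing one row of predicted property values per member would also include the single-point prediction error. Direct exponential reweighting of this kind can become statistically inefficient when the member energies differ by much more than the thermal energy.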
