Variational cross-validation of slow dynamical modes in molecular kinetics

Abstract

We consider the problem of robustly determining the m slowest dynamical modes of a reversible dynamical system, with a particular focus on the analysis of equilibrium molecular dynamics simulations. We show that the problem can be formulated as the variational optimization of a single scalar functional, a generalized matrix Rayleigh quotient (GMRQ), which measures the ability of a rank-m projection operator to capture the slow dynamics of the system. While a variational theorem bounds the GMRQ from above by the sum of the first m eigenvalues of the system's propagator, we show that this bound can be violated when the requisite matrix elements are estimated subject to statistical uncertainty. Furthermore, this overfitting can be detected and avoided through cross-validation, in which the GMRQ is evaluated for the purpose of model selection on data that was held out during training. These results make it possible, for the first time, to construct a unified, consistent objective function for the parameterization of Markov state models for protein dynamics that captures the tradeoff between systematic and statistical errors.
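
As a rough illustration of the quantity described in the abstract, the sketch below evaluates a GMRQ of the form Tr[(A^T C A)(A^T S A)^{-1}] for a toy three-state reversible model. The abstract does not spell out this formula, so the form used here, the function names `fit_projection` and `gmrq`, and the toy transition matrix `T` are all assumptions made for illustration; this is not the authors' implementation. Here C plays the role of a time-lagged correlation matrix and S the overlap matrix, and the columns of A span the rank-m subspace.

```python
import numpy as np


def fit_projection(C_train, S_train, m):
    """Return the top-m generalized eigenvectors of C v = lambda S v.

    Assumes C_train is symmetric and S_train is symmetric positive definite.
    (Hypothetical helper; the name is not taken from the paper.)
    """
    L = np.linalg.cholesky(S_train)        # S = L L^T
    L_inv = np.linalg.inv(L)
    M = L_inv @ C_train @ L_inv.T          # whitened, symmetric problem
    M = 0.5 * (M + M.T)                    # remove round-off asymmetry
    vals, U = np.linalg.eigh(M)
    top = np.argsort(vals)[::-1][:m]       # indices of the m largest eigenvalues
    return L_inv.T @ U[:, top]             # map back to generalized eigenvectors


def gmrq(A, C, S):
    """Assumed GMRQ form: Tr[(A^T S A)^{-1} (A^T C A)]."""
    return float(np.trace(np.linalg.solve(A.T @ S @ A, A.T @ C @ A)))


# Toy reversible 3-state transition matrix with a uniform stationary distribution.
T = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.1, 0.9]])
pi = np.full(3, 1.0 / 3.0)
C = np.diag(pi) @ T                        # exact correlation matrix
S = np.diag(pi)                            # exact overlap matrix

A = fit_projection(C, S, m=2)
score = gmrq(A, C, S)                      # about 1.9 = 1.0 + 0.9, the two largest eigenvalues of T
```

With exact matrices, the score of the optimal A equals the sum of the m largest eigenvalues, and any other A scores no higher. In a cross-validation setting, A would be fit from C and S estimated on the training trajectories, and `gmrq` would then be called with C and S estimated from the held-out trajectories.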
