Enhanced Sampling, Public Dataset and Generative Model for Drug-Protein Dissociation Dynamics

25 April 2025

Abstract

Drug-protein binding and dissociation dynamics are fundamental to understanding molecular interactions in biological systems. Although many tools for studying drug-protein interactions have emerged, especially artificial intelligence (AI)-based generative models, tools that predict binding and dissociation kinetics and dynamics remain limited. We propose a research paradigm that combines molecular dynamics (MD) simulations, enhanced sampling, and AI generative models to address this gap. We introduce an enhanced sampling strategy that efficiently drives the drug-protein dissociation process in MD simulations and estimates the free energy surface (FES). Using an MD simulation pipeline built on this strategy, we generated a dataset of 26,612 drug-protein dissociation trajectories containing about 13 million frames. We named this dissociation dynamics dataset DD-13M and used it to train UnbindingFlow, a deep equivariant generative model that can generate collision-free dissociation trajectories. The DD-13M database and the UnbindingFlow model represent a significant advancement in computational structural biology, and we anticipate their broad applicability in machine learning studies of drug-protein interactions. Our ongoing efforts focus on extending this methodology to a broader spectrum of drug-protein complexes and on exploring novel applications in pathway prediction.
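
To make the notion of a "collision-free" trajectory concrete, the following is a minimal sketch of one way such a property could be checked. It is not the criterion used by the authors: the function names, the array layout (frames × atoms × 3, coordinates in ångströms), and the 2.0 Å cutoff are all assumptions chosen for illustration.

```python
import numpy as np

def min_ligand_protein_distance(ligand_xyz, protein_xyz):
    """Smallest ligand-protein interatomic distance in a single frame.

    ligand_xyz: (n_ligand_atoms, 3), protein_xyz: (n_protein_atoms, 3).
    Units follow the input (assumed here to be angstroms).
    """
    # Pairwise displacement vectors, shape (n_ligand, n_protein, 3)
    diff = ligand_xyz[:, None, :] - protein_xyz[None, :, :]
    return float(np.sqrt((diff ** 2).sum(axis=-1)).min())

def is_collision_free(ligand_traj, protein_traj, min_dist=2.0):
    """True if no frame brings a ligand atom within min_dist of a protein atom.

    min_dist is a hypothetical cutoff, not a value taken from the paper.
    """
    return all(
        min_ligand_protein_distance(lig, prot) >= min_dist
        for lig, prot in zip(ligand_traj, protein_traj)
    )

# Toy system: a two-atom "protein" and a one-atom "ligand" that moves
# away along z by 1 unit per frame, starting 3 units above the origin.
protein = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
protein_traj = np.repeat(protein[None], 4, axis=0)           # (4, 2, 3)
ligand_traj = np.array([[[0.0, 0.0, 3.0 + step]] for step in range(4)])  # (4, 1, 3)

collision_free = is_collision_free(ligand_traj, protein_traj)
```

A real check would more plausibly use element-dependent radii and heavy atoms only; the single distance cutoff here is kept deliberately simple.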

@article{li2025_2504.18367,
  title={Enhanced Sampling, Public Dataset and Generative Model for Drug-Protein Dissociation Dynamics},
  author={Maodong Li and Jiying Zhang and Bin Feng and Wenqi Zeng and Dechin Chen and Zhijun Pan and Yu Li and Zijing Liu and Yi Isaac Yang},
  journal={arXiv preprint arXiv:2504.18367},
  year={2025}
}
