
Matcha: Multi-Stage Riemannian Flow Matching for Accurate and Physically Valid Molecular Docking

Main: 9 Pages
20 Figures
Bibliography: 3 Pages
1 Table
Appendix: 8 Pages
Abstract

Accurate prediction of protein-ligand binding poses is crucial for structure-based drug design, yet existing methods struggle to balance speed, accuracy, and physical plausibility. We introduce Matcha, a novel molecular docking pipeline that combines multi-stage flow matching with learned scoring and physical validity filtering. Our approach consists of three stages applied sequentially to refine docking predictions, each implemented as a flow matching model operating on the appropriate geometric spaces ($\mathbb{R}^3$, $\mathrm{SO}(3)$, and $\mathrm{SO}(2)$). We improve prediction quality with a dedicated scoring model and apply unsupervised physical validity filters to eliminate unrealistic poses. Compared with a range of existing approaches, Matcha demonstrates superior performance on the Astex and PDBbind test sets in terms of docking success rate and physical plausibility. Moreover, our method runs approximately 25 times faster than modern large-scale co-folding models. The model weights and inference code to reproduce our results are available at this https URL.
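
As a rough illustration of what operating on these three spaces involves, the sketch below computes geodesic interpolants of the kind commonly used to build flow matching training targets on $\mathbb{R}^3$, $\mathrm{SO}(3)$, and $\mathrm{SO}(2)$. It is not taken from the Matcha code. The function names are hypothetical, and the mapping of spaces to ligand degrees of freedom (translation, rigid rotation, torsion angle) is an assumption that the abstract does not state.

```python
import math

def interp_r3(x0, x1, t):
    """Straight-line interpolation in R^3 (assumed here: ligand translation)."""
    return tuple((1 - t) * a + t * b for a, b in zip(x0, x1))

def wrap_angle(a):
    """Map an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi

def interp_so2(theta0, theta1, t):
    """Shortest-arc interpolation on SO(2) (assumed here: a torsion angle)."""
    return wrap_angle(theta0 + t * wrap_angle(theta1 - theta0))

def interp_so3(q0, q1, t):
    """Geodesic interpolation on SO(3) via unit quaternions (w, x, y, z)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        # q and -q encode the same rotation; flip to follow the shorter arc.
        q1 = tuple(-c for c in q1)
        dot = -dot
    omega = math.acos(min(1.0, dot))  # half of the rotation angle between q0 and q1
    if omega < 1e-8:
        return q0  # the two rotations are numerically identical
    s = math.sin(omega)
    w0 = math.sin((1 - t) * omega) / s
    w1 = math.sin(t * omega) / s
    return tuple(w0 * a + w1 * b for a, b in zip(q0, q1))

# Example: halfway between the identity and a 90-degree rotation about z.
identity = (1.0, 0.0, 0.0, 0.0)
quarter_turn_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
midpoint = interp_so3(identity, quarter_turn_z, 0.5)  # a 45-degree rotation about z
```

In a flow matching setup, a model would typically be trained to predict the velocity along such paths between a noise sample and the target pose; how Matcha parameterizes and schedules these paths is not specified in the abstract.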
