Importance Weighted Score Matching for Diffusion Samplers with Enhanced Mode Coverage

Abstract

Training neural samplers directly from unnormalized densities, without access to samples from the target distribution, presents a significant challenge. A critical desideratum in these settings is comprehensive mode coverage: the sampler should capture the full diversity of the target distribution. However, prevailing methods often circumvent the lack of target data by optimizing reverse KL-based objectives. Such objectives inherently exhibit mode-seeking behavior, potentially leading to an incomplete representation of the underlying distribution. While alternative approaches strive for better mode coverage, they typically rely on implicit mechanisms such as heuristics or iterative refinement. In this work, we propose a principled approach for training diffusion-based samplers by directly targeting an objective analogous to the forward KL divergence, which is known to encourage mode coverage. We introduce Importance Weighted Score Matching, a method that optimizes this mode-covering objective by re-weighting the score matching loss with tractable importance sampling estimates, thereby overcoming the absence of target distribution data. We also provide a theoretical analysis of the bias and variance of our proposed Monte Carlo estimator and of the practical loss function used in our method. Experiments on increasingly complex multi-modal distributions, including 2D Gaussian Mixture Models with up to 120 modes and challenging particle systems with inherent symmetries, demonstrate that our approach consistently outperforms existing neural samplers across distributional distance metrics, achieving state-of-the-art results on all benchmarks.
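
The abstract describes the method only at a high level. As a rough illustration of the general idea, not the paper's actual objective (which is not reproduced here), the sketch below shows what a self-normalized importance-weighted score matching loss can look like in PyTorch: samples are drawn from a tractable proposal q (e.g., the current sampler), and the per-sample score matching loss is re-weighted by estimates of p(x)/q(x) computed from the unnormalized target density. All names here (iw_score_matching_loss, score_net, log_p_tilde_fn) are hypothetical.

import torch

def iw_score_matching_loss(score_net, log_p_tilde_fn, x, log_q):
    # Conceptual sketch: re-weight a score matching loss with
    # self-normalized importance weights w_i proportional to
    # p_tilde(x_i) / q(x_i), so that samples from the proposal q
    # are re-weighted toward the target (forward-KL flavor).
    #   score_net      -- network s_theta(x) approximating the target score
    #   log_p_tilde_fn -- callable: unnormalized log target density
    #   x              -- batch of samples from the proposal q
    #   log_q          -- proposal log-densities at x, shape (batch,)
    x = x.detach().requires_grad_(True)
    log_p_tilde = log_p_tilde_fn(x)                     # shape (batch,)
    # Target score grad_x log p_tilde(x) via autograd on the known density.
    (target_score,) = torch.autograd.grad(log_p_tilde.sum(), x)
    # Self-normalized importance weights over the batch.
    w = torch.softmax((log_p_tilde - log_q).detach(), dim=0)
    per_sample = ((score_net(x) - target_score.detach()) ** 2).sum(dim=-1)
    return (w * per_sample).sum()

# Hypothetical usage: an unnormalized two-mode Gaussian mixture target.
def log_p_tilde_fn(x):
    d1 = -0.5 * ((x - 3.0) ** 2).sum(-1)
    d2 = -0.5 * ((x + 3.0) ** 2).sum(-1)
    return torch.logsumexp(torch.stack([d1, d2]), dim=0)

Because the weights are self-normalized within the batch, the intractable normalizing constant of the target cancels, which is what makes importance sampling estimates of this kind tractable from an unnormalized density alone.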

@article{wang2025_2505.19431,
  title={Importance Weighted Score Matching for Diffusion Samplers with Enhanced Mode Coverage},
  author={Chenguang Wang and Xiaoyu Zhang and Kaiyuan Cui and Weichen Zhao and Yongtao Guan and Tianshu Yu},
  journal={arXiv preprint arXiv:2505.19431},
  year={2025}
}