Maximum Likelihood Training of Score-Based Diffusion Models

Neural Information Processing Systems (NeurIPS), 2021

22 January 2021

Abstract

Score-based diffusion models synthesize samples by reversing a stochastic process that diffuses data to noise, and are trained by minimizing a weighted combination of score matching losses. The log-likelihood of score-based models can be tractably computed through a connection to continuous normalizing flows, but log-likelihood is not directly optimized by the weighted combination of score matching losses. We show that for a specific weighting scheme, the objective upper bounds the negative log-likelihood, thus enabling approximate maximum likelihood training of score-based models. We empirically observe that maximum likelihood training consistently improves the likelihood of score-based models across multiple datasets, stochastic processes, and model architectures. Our best models achieve negative log-likelihoods of 2.74 and 3.76 bits/dim on CIFAR-10 and down-sampled ImageNet, outperforming all existing likelihood-based models.

View on arXiv

Comments on this paper