Optimal transport map estimation in general function spaces

Abstract

We study the problem of estimating a function $T$ given independent samples from a distribution $P$ and from the pushforward distribution $T_\sharp P$. This setting is motivated by applications in the sciences, where $T$ represents the evolution of a physical system over time, and in machine learning, where, for example, $T$ may represent a transformation learned by a deep neural network trained for a generative modeling task. To ensure identifiability, we assume that $T = \nabla \varphi_0$ is the gradient of a convex function, in which case $T$ is known as an \emph{optimal transport map}. Prior work has studied the estimation of $T$ under the assumption that it lies in a H\"older class, but general theory is lacking. We present a unified methodology for obtaining rates of estimation of optimal transport maps in general function spaces. Our assumptions are significantly weaker than those appearing in the literature: we require only that the source measure $P$ satisfy a Poincar\'e inequality and that the optimal map be the gradient of a smooth convex function that lies in a space whose metric entropy can be controlled. As a special case, we recover known estimation rates for H\"older transport maps, but also obtain nearly sharp results in many settings not covered by prior work. For example, we provide the first statistical rates of estimation when $P$ is the normal distribution and the transport map is given by an infinite-width shallow neural network.
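To make the setting concrete, the following sketch illustrates the estimation problem in the simplest fully tractable case, which is not the paper's method: $P = N(0, I_d)$ and a linear map $T(x) = Ax$ with $A$ symmetric positive definite, so that $T = \nabla \varphi_0$ for the convex potential $\varphi_0(x) = \tfrac{1}{2} x^\top A x$. The matrix `A` and the moment-based estimator below are illustrative assumptions, not part of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 2, 50_000

# Illustrative symmetric positive definite matrix (an assumption for this sketch).
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])

# T(x) = A x is the gradient of the convex potential phi_0(x) = x^T A x / 2,
# hence an optimal transport map from P = N(0, I) to T_# P = N(0, A^2).
X = rng.standard_normal((n, d))   # samples from P
Y = X @ A.T                       # samples from the pushforward T_# P

# In this Gaussian case, Cov(Y) = A Cov(X) A^T = A^2, so the map can be
# recovered as the symmetric matrix square root of the sample covariance.
cov_Y = np.cov(Y, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov_Y)
A_hat = eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T

print(np.max(np.abs(A_hat - A)))  # estimation error shrinks as n grows
```

The general problem studied in the paper replaces this closed-form Gaussian-to-Gaussian recovery with estimation over nonparametric function classes, where no such moment formula is available.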
