Monge, Bregman and Occam: Interpretable Optimal Transport in High-Dimensions with Feature-Sparse Maps

8 February 2023

Pierre Ablin

Abstract

Optimal transport (OT) theory focuses, among all maps $T:\mathbb{R}^d\rightarrow \mathbb{R}^d$ that can morph a probability measure onto another, on those that are the ``thriftiest'', i.e. such that the averaged cost $c(x, T(x))$ between $x$ and its image $T(x)$ be as small as possible. Many computational approaches have been proposed to estimate such Monge maps when $c$ is the $\ell_2^2$ distance, e.g., using entropic maps [Pooladian'22], or neural networks [Makkuva'20, Korotin'20]. We propose a new model for transport maps, built on a family of translation invariant costs $c(x, y):=h(x-y)$ , where $h:=\tfrac{1}{2}\|\cdot\|_2^2+\tau$ and $\tau$ is a regularizer. We propose a generalization of the entropic map suitable for $h$ , and highlight a surprising link tying it with the Bregman centroids of the divergence $D_h$ generated by $h$ , and the proximal operator of $\tau$ . We show that choosing a sparsity-inducing norm for $\tau$ results in maps that apply Occam's razor to transport, in the sense that the displacement vectors $\Delta(x):= T(x)-x$ they induce are sparse, with a sparsity pattern that varies depending on $x$ . We showcase the ability of our method to estimate meaningful OT maps for high-dimensional single-cell transcription data, in the $34000$ - $d$ space of gene counts for cells, without using dimensionality reduction, thus retaining the ability to interpret all displacements at the gene level.

View on arXiv

Comments on this paper