
An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning

Abstract

Motivated by the developing mathematics of deep learning, we build universal approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$ using elementary functions between Euclidean spaces as building blocks. Earlier results assume that the target space $\mathcal{Y}$ is a topological vector space. We overcome this limitation by "randomization": our approximators output discrete probability measures over $\mathcal{Y}$. When $\mathcal{X}$ and $\mathcal{Y}$ are Polish without additional structure, we prove very general qualitative guarantees; when they have suitable combinatorial structure, we prove quantitative guarantees for Hölder-like maps, including maps between finite graphs, solution operators to rough differential equations between certain Carnot groups, and continuous non-linear operators between Banach spaces arising in inverse problems. In particular, we show that the required number of Dirac measures is determined by the combinatorial structure of $\mathcal{X}$ and $\mathcal{Y}$. For barycentric $\mathcal{Y}$, including Banach spaces, $\mathbb{R}$-trees, Hadamard manifolds, and Wasserstein spaces on Polish metric spaces, our approximators reduce to $\mathcal{Y}$-valued functions. When the Euclidean approximators are neural networks, our constructions generalize transformer networks, providing a new probabilistic viewpoint of geometric deep learning.
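As a rough schematic of this construction (the notation below is ours, not taken from the paper), a randomized approximator attaches to each input a finitely supported probability measure of the form

$$\hat{F}(x) \;=\; \sum_{n=1}^{N} \big[\mathrm{Softmax}_N\big(f(\varphi(x))\big)\big]_n \,\delta_{y_n},$$

where $\varphi:\mathcal{X}\to\mathbb{R}^d$ is a feature map, $f:\mathbb{R}^d\to\mathbb{R}^N$ is an elementary (e.g. neural-network) function between Euclidean spaces, $y_1,\dots,y_N\in\mathcal{Y}$ are fixed points, and $\delta_{y_n}$ is the Dirac measure at $y_n$. When $\mathcal{Y}$ is barycentric, the mixture can be contracted to a single point of $\mathcal{Y}$ via a barycenter of the weighted points, recovering a $\mathcal{Y}$-valued model.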
