Fast Feature Sampling from Implicit Infinite-width Models

Abstract

Infinite-width models have succeeded in facilitating the theoretical understanding of modern, large-scale, nonlinear models such as neural networks. When the number $p$ of features exceeds the size $n$ of the training dataset, the model tends to be linear and the optimization problem tends to be convex, because the design matrix $S$ tends to have full row rank. A variety of recent over-parametrization schemes therefore reduce to estimating a pseudo-inverse operator $S^{\dagger}$. In this study, we establish a new fast sampling method that approximates the pseudo-inverse operator without computing it explicitly. Technically, we develop the kernel mean embedding (KME), maximum mean discrepancy (MMD), and generalized kernel quadrature (GKQ) for parameter distributions, achieving a fast approximation rate $O(e^{-p})$, which is faster than the classical Barron rate $O(1/\sqrt{p})$. A convergence analysis based on local Rademacher complexity shows that our method attains a fast learning rate of $\widetilde{O}(1/n)$.
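For context, the link between over-parametrization and the pseudo-inverse can be made explicit by a standard fact not spelled out in the abstract: when $p \ge n$ and $S \in \mathbb{R}^{n \times p}$ has full row rank, the minimum-norm interpolant of the labels $y$, which gradient descent on the squared loss reaches from zero initialization in the linear regime, is
$$\hat{\theta} = S^{\dagger} y = S^{\top} (S S^{\top})^{-1} y,$$
so approximating $S^{\dagger}$ amounts to approximating this estimator without forming or inverting $S S^{\top}$.

The abstract gives no implementation details, so the following NumPy sketch illustrates only the general flavor of KME/MMD-based parameter sampling, using greedy kernel herding as a simple stand-in for the paper's generalized kernel quadrature. The Gaussian kernel choice, the herding heuristic, and all function names here are our assumptions, not the authors' method.

```python
# Illustrative sketch only (not the paper's GKQ algorithm): greedy kernel
# herding picks a small parameter subset whose kernel mean embedding (KME)
# stays close, in MMD, to that of the full candidate set.
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    # Gaussian (RBF) kernel matrix between the rows of X and Y.
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def herd_samples(candidates, m, bandwidth=1.0):
    # Greedily pick m rows of `candidates` approximating the empirical KME.
    K = gaussian_kernel(candidates, candidates, bandwidth)
    kme = K.mean(axis=1)  # inner products <k(x_i, .), empirical KME>
    chosen = []
    score = kme.copy()
    for t in range(m):
        i = int(np.argmax(score))
        chosen.append(i)
        # Standard herding update: down-weight points near earlier picks.
        score = (t + 2) * kme - K[:, chosen].sum(axis=1)
    return np.asarray(chosen)

def mmd2(X, Y, bandwidth=1.0):
    # (Biased) squared MMD between the empirical distributions of X and Y.
    return (gaussian_kernel(X, X, bandwidth).mean()
            - 2.0 * gaussian_kernel(X, Y, bandwidth).mean()
            + gaussian_kernel(Y, Y, bandwidth).mean())

rng = np.random.default_rng(0)
params = rng.normal(size=(500, 4))  # candidate feature parameters
herded = params[herd_samples(params, m=32)]
random = params[rng.choice(500, size=32, replace=False)]
print("MMD^2, herded samples:", mmd2(params, herded))
print("MMD^2, random samples:", mmd2(params, random))
```

On this toy problem the herded subset typically attains a visibly smaller MMD than a uniformly random subset of the same size; the paper's $O(e^{-p})$ result quantifies a much sharper version of this kind of guarantee for its own quadrature scheme.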