
Optimal Neural Network Approximation for High-Dimensional Continuous Functions

Main: 7 pages, 4 figures; bibliography: 2 pages
Abstract

Recently, Shen, Yang, and Zhang (JMLR, 2022) developed a neural network with width $36d(2d+1)$ and depth $11$, which uses a special activation function called the elementary universal activation function, to achieve the super approximation property for functions in $C([a,b]^d)$. That is, the constructed network requires only a fixed number of neurons to approximate a $d$-variate continuous function on a $d$-dimensional hypercube with arbitrary accuracy. Their network uses $\mathcal{O}(d^2)$ fixed neurons. A natural question is whether the number of neurons in such a network can be reduced. By leveraging a variant of the Kolmogorov Superposition Theorem, our analysis shows that there is a neural network generated by the elementary universal activation function with only $366d + 365$ fixed, intrinsic (non-repeated) neurons that attains this super approximation property. Furthermore, we present a family of continuous functions that requires at least width $d$, and therefore at least $d$ intrinsic neurons, to be approximated to arbitrary accuracy. This shows that the requirement of $\mathcal{O}(d)$ intrinsic neurons is optimal in the sense that it grows linearly with the input dimension $d$, unlike some approximation methods whose parameter counts grow exponentially with $d$.
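To make the scaling claim concrete, the sketch below tabulates the two neuron counts quoted in the abstract as functions of the input dimension $d$. It is only an illustrative comparison, not a reproduction of either construction: the helper names are hypothetical, and the $\mathcal{O}(d^2)$ count assumes the total is roughly width $\times$ depth for the width-$36d(2d+1)$, depth-$11$ network, while the $\mathcal{O}(d)$ count is the $366d + 365$ intrinsic neurons stated in the text.

```python
def neurons_width_depth_network(d: int) -> int:
    """Approximate neuron count of the width-36d(2d+1), depth-11 network,
    assuming total neurons ~ width x depth (an O(d^2) quantity)."""
    return 36 * d * (2 * d + 1) * 11


def neurons_intrinsic_network(d: int) -> int:
    """Intrinsic (non-repeated) neuron count 366d + 365 from this paper's
    construction (an O(d) quantity)."""
    return 366 * d + 365


if __name__ == "__main__":
    # Compare growth of the two counts as the input dimension d increases.
    for d in (1, 10, 100, 1000):
        print(f"d={d:5d}  O(d^2)-type count: {neurons_width_depth_network(d):>12,}"
              f"  O(d)-type count: {neurons_intrinsic_network(d):>10,}")
```

Even for moderate $d$ the gap is pronounced (for example, at $d = 100$ the quadratic-type count is in the millions while $366d + 365$ stays below $40{,}000$), which is the linear-versus-quadratic behavior the abstract emphasizes.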
