
Optimal Neural Network Approximation for High-Dimensional Continuous Functions

Main: 7 pages, 4 figures; bibliography: 2 pages
Abstract

Recently, Shen, Yang, and Zhang (JMLR, 2022) developed a neural network with width $36d(2d+1)$ and depth $11$, which uses a special activation function called the elementary universal activation function, to achieve the super approximation property for functions in $C([a,b]^d)$. That is, the constructed network requires only a fixed number of neurons to approximate a $d$-variate continuous function on a $d$-dimensional hypercube with arbitrary accuracy. Their network uses $\mathcal{O}(d^2)$ fixed neurons. A natural question is whether the number of neurons in such a network can be reduced. By leveraging a variant of the Kolmogorov Superposition Theorem, our analysis shows that there is a neural network generated by the elementary universal activation function with only $366d + 365$ fixed, intrinsic (non-repeated) neurons that attains this super approximation property. Furthermore, we present a family of continuous functions that requires at least width $d$, and therefore at least $d$ intrinsic neurons, to be approximated to arbitrary accuracy. This shows that the requirement of $\mathcal{O}(d)$ intrinsic neurons is optimal in the sense that it grows linearly with the input dimension $d$, unlike some approximation methods whose parameter counts grow exponentially with $d$.
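To make the scaling claim concrete, the sketch below tabulates the two neuron counts quoted in the abstract as functions of the input dimension $d$. It is only an illustrative comparison, not a reproduction of either construction: the helper names are hypothetical, and the $\mathcal{O}(d^2)$ count assumes the total is roughly width $\times$ depth for the width-$36d(2d+1)$, depth-$11$ network, while the $\mathcal{O}(d)$ count is the $366d + 365$ intrinsic neurons stated in the text.

```python
def neurons_width_depth_network(d: int) -> int:
    """Approximate neuron count of the width-36d(2d+1), depth-11 network,
    assuming total neurons ~ width x depth (an O(d^2) quantity)."""
    return 36 * d * (2 * d + 1) * 11


def neurons_intrinsic_network(d: int) -> int:
    """Intrinsic (non-repeated) neuron count 366d + 365 from this paper's
    construction (an O(d) quantity)."""
    return 366 * d + 365


if __name__ == "__main__":
    # Compare growth of the two counts as the input dimension d increases.
    for d in (1, 10, 100, 1000):
        print(f"d={d:5d}  O(d^2)-type count: {neurons_width_depth_network(d):>12,}"
              f"  O(d)-type count: {neurons_intrinsic_network(d):>10,}")
```

Even for moderate $d$ the gap is pronounced (for example, at $d = 100$ the quadratic-type count is in the millions while $366d + 365$ stays below $40{,}000$), which is the linear-versus-quadratic behavior the abstract emphasizes.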
