
Deep Neural Networks with ReLU-Sine-Exponential Activations Break Curse of Dimensionality in Approximation on Hölder Class

Abstract

In this paper, we construct neural networks with ReLU, sine and $2^x$ as activation functions. For a general continuous function $f$ defined on $[0,1]^d$ with modulus of continuity $\omega_f(\cdot)$, we construct ReLU-sine-$2^x$ networks that enjoy an approximation rate $\mathcal{O}\left(\omega_f(\sqrt{d})\cdot 2^{-M}+\omega_f\left(\frac{\sqrt{d}}{N}\right)\right)$, where $M,N\in\mathbb{N}^{+}$ denote the hyperparameters related to the widths of the networks. As a consequence, we can construct a ReLU-sine-$2^x$ network with depth $5$ and width $\max\left\{\left\lceil 2d^{3/2}\left(\frac{3\mu}{\epsilon}\right)^{1/\alpha}\right\rceil,\, 2\left\lceil\log_2\frac{3\mu d^{\alpha/2}}{2\epsilon}\right\rceil+2\right\}$ that approximates $f\in\mathcal{H}_{\mu}^{\alpha}([0,1]^d)$ within a given tolerance $\epsilon>0$, measured in the $L^p$ norm with $p\in[1,\infty)$, where $\mathcal{H}_{\mu}^{\alpha}([0,1]^d)$ denotes the class of Hölder continuous functions on $[0,1]^d$ with order $\alpha\in(0,1]$ and constant $\mu>0$. Therefore, ReLU-sine-$2^x$ networks overcome the curse of dimensionality on $\mathcal{H}_{\mu}^{\alpha}([0,1]^d)$. In addition to their super expressive power, the functions implemented by ReLU-sine-$2^x$ networks are (generalized) differentiable, which allows them to be trained with SGD.
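The width bound above is fully explicit, so it can be evaluated directly for given $d$, $\alpha$, $\mu$ and $\epsilon$. Below is a minimal sketch in Python/NumPy: `width_bound` evaluates the formula from the abstract, while `relu_sine_exp2_forward` is only an illustrative stand-in showing how the three activations can be stacked in a shallow network; the random weights and the layer layout are hypothetical and are not the paper's actual construction.

```python
import numpy as np

def width_bound(d, alpha, mu, eps):
    """Width from the abstract's bound for approximating a Hoelder-(alpha, mu)
    function on [0,1]^d within tolerance eps (the depth is fixed at 5)."""
    w1 = int(np.ceil(2.0 * d**1.5 * (3.0 * mu / eps) ** (1.0 / alpha)))
    w2 = 2 * int(np.ceil(np.log2(3.0 * mu * d ** (alpha / 2.0) / (2.0 * eps)))) + 2
    return max(w1, w2)

def relu_sine_exp2_forward(x, params):
    """Toy forward pass stacking the three activations (ReLU, sin, 2^x).
    Purely illustrative: the paper prescribes specific weights, not random ones."""
    (W1, b1), (W2, b2), (W3, b3), (W4, b4) = params
    h = np.maximum(W1 @ x + b1, 0.0)  # ReLU layer
    h = np.sin(W2 @ h + b2)           # sine layer
    h = 2.0 ** (W3 @ h + b3)          # 2^x layer
    return W4 @ h + b4                # linear read-out

if __name__ == "__main__":
    d, alpha, mu, eps = 3, 0.5, 1.0, 1e-2
    print("width required by the bound:", width_bound(d, alpha, mu, eps))

    # Hypothetical toy instance with a small width, just to run the forward pass.
    rng = np.random.default_rng(0)
    w = 8
    shapes = [(w, d), (w, w), (w, w), (1, w)]
    params = [(0.1 * rng.standard_normal(s), np.zeros(s[0])) for s in shapes]
    x = rng.uniform(0.0, 1.0, size=d)
    print("toy network output:", relu_sine_exp2_forward(x, params))
```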
