Deep Network Approximation: Beyond ReLU to Diverse Activation Functions

This paper explores the expressive power of deep neural networks for a diverse range of activation functions. An activation function set $\mathscr{A}$ is defined to encompass the majority of commonly used activation functions, such as ReLU, LeakyReLU, ReLU^2, ELU, CELU, SELU, Softplus, GELU, SiLU, Swish, Mish, Sigmoid, Tanh, Arctan, Softsign, dSiLU, and SRS. We demonstrate that for any activation function $\varrho \in \mathscr{A}$, a ReLU network of width $N$ and depth $L$ can be approximated to arbitrary precision by a $\varrho$-activated network of width $3N$ and depth $2L$ on any bounded set. This finding enables the extension of most approximation results achieved with ReLU networks to a wide variety of other activation functions, albeit with slightly increased constants. Significantly, we establish that the (width, depth) scaling factors can be further reduced from $(3, 2)$ to $(1, 1)$ if $\varrho$ falls within a specific subset of $\mathscr{A}$. This subset includes activation functions such as ELU, CELU, SELU, Softplus, GELU, SiLU, Swish, and Mish.
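To give some intuition for why a $(1, 1)$ scaling is plausible for activations such as Softplus and SiLU, the short sketch below (an illustration under my own assumptions, not the paper's construction) checks numerically that the rescaled activation $\varrho(tx)/t$ converges uniformly to ReLU$(x)$ as $t$ grows, so a single $\varrho$-neuron can emulate a single ReLU neuron without enlarging width or depth.

    import numpy as np

    def relu(x):
        return np.maximum(x, 0.0)

    def softplus(x):
        # numerically stable log(1 + exp(x))
        return np.logaddexp(0.0, x)

    def silu(x):
        # x * sigmoid(x), with sigmoid written via tanh to avoid overflow
        return x * 0.5 * (1.0 + np.tanh(0.5 * x))

    # Illustrative check: on a bounded interval, rho(t*x)/t approaches ReLU(x)
    # as t grows, so one rho-neuron can stand in for one ReLU neuron.
    x = np.linspace(-5.0, 5.0, 2001)
    for t in (1.0, 10.0, 100.0):
        for name, rho in (("Softplus", softplus), ("SiLU", silu)):
            sup_err = np.max(np.abs(rho(t * x) / t - relu(x)))
            print(f"t={t:6.1f}  {name:8s} sup-error on [-5, 5]: {sup_err:.5f}")

For Softplus, for example, the sup-norm error is at most $\ln 2 / t$ (attained at $x = 0$), so it can be driven below any target tolerance by taking $t$ large; this kind of uniform control is what keeps the error manageable when many such neurons are composed across layers.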