Activation Functions Not To Active: A Plausible Theory on Interpreting
Neural Networks
Researchers commonly believe that neural networks model a high-dimensional space but cannot give a clear definition of this space. What is this space? What is its dimension? And does it have finitely many dimensions? In this paper, we develop a plausible theory on interpreting neural networks (NNs) in terms of the role of activation functions, and we define a high-dimensional (more precisely, an infinite-dimensional) space. We conjecture that the activation function acts as a magnifying function that maps the low-dimensional linear space into an infinite-dimensional space. Given a dataset with each example of $d$ features $x_1, x_2, \cdots, x_d$, we believe that NNs model a special space with infinitely many dimensions, each of which is a monomial $x_1^{i_1} x_2^{i_2} \cdots x_d^{i_d}$ for some non-negative integers $i_1, i_2, \cdots, i_d \in \mathbb{Z}_{0}^{+}=\{0,1,2,3,\ldots\}$. We term such an infinite-dimensional space a Super Space (SS), and we see each such dimension as the minimum information unit. Every neuron node that has passed through an activation layer in NNs is a Super Plane (SP), which is in fact a polynomial of infinite degree. This is something like a coordinate system, in which every multivariate function can be represented by a Super Plane. From this perspective, a neural network for regression tasks can be seen as an extension of linear regression, i.e., an advanced variant of linear regression with infinite-dimensional features, just as logistic regression is an extension of linear regression. We also show that training NNs can, at least in principle, be reduced to solving a system of nonlinear equations.
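
The "magnifying" claim can be illustrated symbolically. The sketch below (our own illustration, not code from the paper; the tanh activation, the two-feature setup, and the toy weights are all assumptions) Taylor-expands an activation applied to a linear combination of features, showing how monomials $x_1^{i_1} x_2^{i_2}$ of ever-higher degree appear; only the truncation keeps the list finite.

```python
import sympy as sp

x1, x2, u = sp.symbols("x1 x2 u")

# Pre-activation: an ordinary linear combination of two features (toy weights).
z = sp.Rational(1, 2) * x1 + sp.Rational(1, 3) * x2 + sp.Rational(1, 5)

# Taylor series of tanh around 0, truncated at degree 5; the true series is
# infinite, which is where the infinite-dimensional picture comes from.
taylor = sp.tanh(u).series(u, 0, 6).removeO()

# Substituting the linear map and expanding scatters the features into
# monomials x1^i1 * x2^i2 -- the proposed "dimensions" of the space.
poly = sp.expand(taylor.subs(u, z))
for (i1, i2), coeff in sp.Poly(poly, x1, x2).terms():
    print(f"x1^{i1} * x2^{i2}: {coeff}")
```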
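Likewise, the closing remark about nonlinear equations can be made concrete: requiring $\mathrm{NN}(x_k; \theta) = y_k$ for every training example yields one equation per example in the unknown parameters $\theta$. The following is a minimal sketch under our own assumptions (a 1-4-1 tanh network and synthetic sine data; the paper does not prescribe this setup), solved with a standard nonlinear least-squares routine.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
X = np.linspace(-1.0, 1.0, 8)   # 8 training inputs
Y = np.sin(np.pi * X)           # 8 targets -> 8 nonlinear equations

def residuals(theta):
    # theta packs a 1-4-1 network: W1 (4), b1 (4), W2 (4), b2 (1) = 13 unknowns.
    W1, b1, W2, b2 = theta[:4], theta[4:8], theta[8:12], theta[12]
    hidden = np.tanh(np.outer(X, W1) + b1)   # shape (8, 4)
    pred = hidden @ W2 + b2                  # shape (8,)
    return pred - Y                          # one equation per example

sol = least_squares(residuals, rng.normal(size=13))
print("max |NN(x_k) - y_k|:", np.abs(sol.fun).max())
```

With more unknowns than equations the system is underdetermined, so a solution driving every residual to (near) zero typically exists, which is consistent with reading training as equation solving rather than only as loss minimization.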