Who Said Neural Networks Aren't Linear?

Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f:\mathcal{X}\to\mathcal{Y}$. Is it possible to identify a pair of non-standard vector spaces for which a conventionally nonlinear function is, in fact, linear? This paper introduces a method that makes such vector spaces explicit by construction. We find that if we sandwich a linear operator $A$ between two invertible neural networks, $f(x) = g_y^{-1}(A\, g_x(x))$, then the corresponding vector spaces $\mathcal{X}$ and $\mathcal{Y}$ are induced by newly defined addition and scaling actions derived from $g_x$ and $g_y$. We term this kind of architecture a Linearizer. This framework makes the entire arsenal of linear algebra, including SVD, pseudo-inverse, orthogonal projection, and more, applicable to nonlinear mappings. Furthermore, we show that the composition of two Linearizers that share a neural network is also a Linearizer. We leverage this property and demonstrate that training diffusion models using our architecture makes the hundreds of sampling steps collapse into a single step. We further utilize our framework to enforce idempotency (i.e., $f(f(x)) = f(x)$) on networks, leading to a globally projective generative model, and to demonstrate modular style transfer.
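The construction described above can be checked numerically. The sketch below is not the authors' code: it stands in simple invertible elementwise maps for the invertible networks $g_x$ and $g_y$ (the specific maps, variable names, and dimensions are illustrative assumptions), defines the induced addition and scaling on the input and output spaces, and verifies that $f(x) = g_y^{-1}(A\, g_x(x))$ is additive and homogeneous with respect to those non-standard operations.

```python
# Minimal sketch of the Linearizer idea with toy invertible maps in place of
# invertible neural networks. All names here are illustrative, not the paper's API.
import numpy as np

rng = np.random.default_rng(0)

# Invertible stand-ins for the networks g_x, g_y, with explicit inverses.
g_x     = lambda x: x**3
g_x_inv = lambda x: np.cbrt(x)
g_y     = lambda y: np.sign(y) * np.log1p(np.abs(y))
g_y_inv = lambda y: np.sign(y) * np.expm1(np.abs(y))

A = rng.normal(size=(3, 3))  # the linear operator sandwiched in the middle

def f(x):
    """Linearizer: f(x) = g_y^{-1}(A g_x(x))."""
    return g_y_inv(A @ g_x(x))

# Induced addition and scaling on the input space X ...
def add_X(x1, x2):  return g_x_inv(g_x(x1) + g_x(x2))
def scale_X(a, x):  return g_x_inv(a * g_x(x))

# ... and on the output space Y.
def add_Y(y1, y2):  return g_y_inv(g_y(y1) + g_y(y2))
def scale_Y(a, y):  return g_y_inv(a * g_y(y))

# f is linear with respect to these non-standard vector-space operations.
x1, x2 = rng.normal(size=3), rng.normal(size=3)
a = 2.5
print(np.allclose(f(add_X(x1, x2)), add_Y(f(x1), f(x2))))  # additivity  -> True
print(np.allclose(f(scale_X(a, x1)), scale_Y(a, f(x1))))   # homogeneity -> True
```

Under these induced operations the toy $f$ satisfies the usual linearity axioms, which is the property that, per the abstract, lets linear-algebra tools such as SVD, pseudo-inverses, and projections be applied to the nonlinear mapping.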