
Who Said Neural Networks Aren't Linear?

Main: 9 pages · 11 figures · 5 tables · Bibliography: 3 pages · Appendix: 10 pages
Abstract

Neural networks are famously nonlinear. However, linearity is defined relative to a pair of vector spaces, $f: X \to Y$. Is it possible to identify a pair of non-standard vector spaces for which a conventionally nonlinear function is, in fact, linear? This paper introduces a method that makes such vector spaces explicit by construction. We find that if we sandwich a linear operator $A$ between two invertible neural networks, $f(x) = g_y^{-1}(A\, g_x(x))$, then the corresponding vector spaces $X$ and $Y$ are induced by newly defined addition and scaling actions derived from $g_x$ and $g_y$. We term this kind of architecture a Linearizer. This framework makes the entire arsenal of linear algebra, including the SVD, pseudo-inverse, orthogonal projection and more, applicable to nonlinear mappings. Furthermore, we show that the composition of two Linearizers that share a neural network is also a Linearizer. We leverage this property and demonstrate that training diffusion models using our architecture makes the hundreds of sampling steps collapse into a single step. We further utilize our framework to enforce idempotency (i.e., $f(f(x)) = f(x)$) on networks, leading to a globally projective generative model, and to demonstrate modular style transfer.
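As a toy illustration (not taken from the paper), the sketch below shows the construction in NumPy: an elementwise cube stands in for the invertible networks $g_x$ and $g_y$, and the induced addition and scaling are the natural pullbacks suggested by $f(x) = g_y^{-1}(A\, g_x(x))$; the specific operation definitions are my assumption, not the authors' code.

```python
import numpy as np

# Hypothetical stand-ins for the paper's invertible networks g_x and g_y:
# an elementwise cube is a simple invertible map on R^d (inverse: cube root).
g_x, g_x_inv = (lambda x: x**3), np.cbrt   # acts on the input space X
g_y, g_y_inv = (lambda y: y**3), np.cbrt   # acts on the output space Y

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 3))                # the sandwiched linear operator

def f(x):
    """Linearizer: f(x) = g_y^{-1}(A g_x(x))."""
    return g_y_inv(A @ g_x(x))

# Vector-space operations induced on X and Y by pulling back the
# standard addition and scaling through g_x and g_y (an assumption
# consistent with the abstract's description).
def add_X(x1, x2): return g_x_inv(g_x(x1) + g_x(x2))
def scale_X(a, x): return g_x_inv(a * g_x(x))
def add_Y(y1, y2): return g_y_inv(g_y(y1) + g_y(y2))
def scale_Y(a, y): return g_y_inv(a * g_y(y))

x1, x2, a = rng.normal(size=3), rng.normal(size=3), 2.5

# f is linear with respect to the induced operations:
assert np.allclose(f(add_X(x1, x2)), add_Y(f(x1), f(x2)))
assert np.allclose(f(scale_X(a, x1)), scale_Y(a, f(x1)))
```

The checks pass because $g_x$ and $g_y$ cancel with their inverses, reducing both sides to ordinary linearity of $A$ in the "latent" coordinates.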
