
Generative Kaleidoscopic Networks

Abstract

We discovered that Deep ReLU networks (i.e., the Multilayer Perceptron architecture) demonstrate an 'over-generalization' phenomenon. That is, the output values for inputs that were not seen during training are mapped close to the output range that was observed during the learning process. In other words, the MLP learns a many-to-one mapping, and this effect becomes more prominent as we increase the number of layers, or depth, of the MLP. We utilize this property of Deep ReLU networks to design a dataset kaleidoscope, termed 'Generative Kaleidoscopic Networks'. Briefly, if we learn an MLP to map an input $x\in\mathbb{R}^D$ to itself, $f_\mathcal{N}(x)\rightarrow x$, the 'Kaleidoscopic sampling' procedure starts with random input noise $z\in\mathbb{R}^D$ and recursively applies $f_\mathcal{N}(\cdots f_\mathcal{N}(z)\cdots)$. After a burn-in period, we start observing samples from the input distribution, and we found that the deeper the MLP, the higher the quality of the recovered samples. Scope: We observed this phenomenon to various degrees for other deep learning architectures such as CNNs, Transformers & U-Nets, and we are currently investigating them further.
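The procedure above can be illustrated with a minimal PyTorch sketch: fit a deep ReLU MLP $f_\mathcal{N}$ to reconstruct its own input, then iterate it from random noise and keep iterates after a burn-in period. The architecture sizes, optimizer, noise distribution, and function names below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

D, H, L = 16, 128, 10  # input dim, hidden width, depth (assumed values)

# Deep ReLU MLP mapping R^D -> R^D, to be trained so that f_N(x) ~ x on the data
layers = [nn.Linear(D, H), nn.ReLU()]
for _ in range(L - 2):
    layers += [nn.Linear(H, H), nn.ReLU()]
layers += [nn.Linear(H, D)]
f_N = nn.Sequential(*layers)

def train_manifold(f_N, data, epochs=500, lr=1e-3):
    """Fit the identity map f_N(x) -> x on observed data via reconstruction loss."""
    opt = torch.optim.Adam(f_N.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = ((f_N(data) - data) ** 2).mean()
        loss.backward()
        opt.step()
    return f_N

@torch.no_grad()
def kaleidoscopic_sampling(f_N, num_samples=64, burn_in=100, keep=10):
    """Start from random noise z and recursively apply f_N; iterates after the
    burn-in period are treated as samples from the input distribution."""
    z = torch.rand(num_samples, D)  # random input noise z in R^D
    samples = []
    for t in range(burn_in + keep):
        z = f_N(z)                  # z <- f_N(z)
        if t >= burn_in:
            samples.append(z.clone())
    return torch.stack(samples)

# Usage, given a training tensor X of shape [N, D]:
# f_N = train_manifold(f_N, X)
# samples = kaleidoscopic_sampling(f_N)
```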
