Deep Automodulators
We introduce a new family of generative neural network models called automodulators. Like autoencoders, these networks can faithfully reproduce individual real-world input images, but they can also generate a fused sample from an arbitrary combination of several such images, allowing "style-mixing" and other new applications. An automodulator decouples the data flow of decoder operations from their statistical properties and uses the latent vector to modulate the former by the latter, with a principled approach for mutual disentanglement of decoder layers. This is the first general-purpose model to successfully apply this principle to existing input images, whereas prior work has focused on random sampling in GANs. We introduce novel techniques for stable unsupervised training of the model on four high-resolution data sets. Besides style-mixing, we show state-of-the-art results in comparisons with autoencoders, and visual image quality nearly indistinguishable from that of state-of-the-art GANs. We expect automodulator variants to become a useful building block for image applications and other data domains.
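The modulation principle described above, where the latent vector re-scales the statistics of decoder activations rather than flowing through them directly, can be illustrated with an adaptive-instance-normalization-style sketch. This is a hypothetical simplification for intuition only, not the paper's actual implementation; the function name and scalar scale/shift parameters are assumptions:

```python
import statistics

def modulate(features, latent_scale, latent_shift, eps=1e-8):
    """Latent-driven modulation of one decoder layer's activations (sketch).

    Normalizes the activations to zero mean and unit variance, then
    re-scales and shifts them with values derived from the latent code,
    so the layer's statistics come from the latent, not the data flow.
    """
    mean = statistics.fmean(features)
    var = statistics.pvariance(features)
    # Strip the layer's own statistics from the data flow.
    normalized = [(f - mean) / (var + eps) ** 0.5 for f in features]
    # Re-inject statistics dictated by the latent vector.
    return [latent_scale * n + latent_shift for n in normalized]
```

Because each layer receives its own scale and shift, latent codes from different input images can drive different layers, which is what makes style-mixing between images possible.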