The Free Transformer

Main: 11 Pages
9 Figures
Bibliography: 2 Pages
3 Tables
Appendix: 5 Pages
Abstract
We propose an extension of the decoder Transformer that conditions its generative process on random latent variables, which are learned without supervision through a variational procedure. Experimental evaluations show that such conditioning translates into substantial improvements on downstream tasks.
