Synthesizing Dynamic Textures and Sounds by Spatial-Temporal Generative ConvNet

3 June 2016

Abstract

Dynamic textures are spatial-temporal processes that exhibit statistical stationarity or stochastic repetitiveness in the temporal dimension. In this paper, we study the problem of modeling and synthesizing dynamic textures using a generative version of the convolutional neural network (ConvNet or CNN) that consists of multiple layers of spatial-temporal filters to capture the spatial-temporal patterns in dynamic textures. We show that such a spatial-temporal generative ConvNet can synthesize realistic dynamic textures. We also apply a temporal generative ConvNet to one-dimensional sound data, and show that the model can synthesize realistic natural and man-made sounds. The videos and sounds can be found at http://www.stat.ucla.edu/~jxie/STGConvNet/STGConvNet.html
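To make the phrase "spatial-temporal filter" concrete, the following is a minimal sketch of a single filter sliding over both time and space in a tiny video. It is an illustration only, not the model from the paper: the function names, the rectification step, the summed score, and the toy data are all assumptions introduced here, and the abstract does not specify the actual architecture or training procedure.

```python
def spatiotemporal_filter_response(video, kernel):
    """Valid cross-correlation of a (T, H, W) video with a (t, h, w) kernel,
    followed by rectification (the rectification is an assumption of this sketch)."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    t, h, w = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for a in range(T - t + 1):          # slide over time
        plane = []
        for b in range(H - h + 1):      # slide over rows
            row = []
            for c in range(W - w + 1):  # slide over columns
                s = 0.0
                for i in range(t):
                    for j in range(h):
                        for k in range(w):
                            s += kernel[i][j][k] * video[a + i][b + j][c + k]
                row.append(max(0.0, s))
            plane.append(row)
        out.append(plane)
    return out


def score(video, kernel):
    """Hypothetical scalar score: sum of rectified responses over all positions and times."""
    response = spatiotemporal_filter_response(video, kernel)
    return sum(v for plane in response for row in plane for v in row)


# Toy data (assumed): three 2x2 frames whose brightness increases by 1 per frame.
video = [[[float(f)] * 2 for _ in range(2)] for f in range(3)]
# A 2x1x1 kernel that takes the difference between consecutive frames.
kernel = [[[-1.0]], [[1.0]]]

brightening = score(video, kernel)       # responds to increasing brightness
darkening = score(video[::-1], kernel)   # rectification suppresses decreasing brightness
```

Because the kernel spans two frames, it responds to change over time rather than to the content of any single frame, which is the sense in which such filters capture temporal as well as spatial patterns. A one-dimensional sound signal corresponds to the special case where only the time axis is present.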
