Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic
Generative Adversarial Networks
Taking a photo outside, can we predict the immediate future, such as how the clouds will move across the sky? We answer this question by presenting a generative adversarial network (GAN) based two-stage approach to generating realistic, high-resolution time-lapse videos. Given the first frame, our model learns to generate long-term future frames. The first stage aims to generate videos whose content is consistent with the input frame and whose motion dynamics are plausible. The second stage refines the video produced by the first stage, enforcing it to be closer to real videos with regard to motion dynamics. To further encourage realistic motion in the final generated video, a Gram matrix is employed to model motion more precisely. We build a large-scale time-lapse dataset and test our approach on this new dataset. Using our model, we are able to generate videos of up to 128x128 resolution for 32 frames in a single forward pass. Quantitative and qualitative experimental results demonstrate the superiority of our method over state-of-the-art methods.
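The Gram matrix mentioned above captures correlations between feature channels across space and time, which the refinement stage can use to push the dynamics of generated clips toward those of real videos. Below is a minimal sketch of such a Gram-based motion term, assuming PyTorch and 3-D convolutional features; the tensor shapes, the L1 loss form, and the normalization are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def gram_matrix(feats):
    """Gram matrix of a batch of spatio-temporal feature maps.

    feats: tensor of shape (batch, channels, time, height, width),
    e.g. intermediate 3-D conv features of a video discriminator.
    Returns a (batch, channels, channels) tensor of channel correlations.
    """
    b, c, t, h, w = feats.shape
    flat = feats.reshape(b, c, t * h * w)          # flatten spatio-temporal dims
    gram = torch.bmm(flat, flat.transpose(1, 2))   # channel-by-channel correlations
    return gram / (c * t * h * w)                  # normalize by feature size

def gram_motion_loss(fake_feats, real_feats):
    """L1 distance between Gram matrices of generated and real video features."""
    return torch.abs(gram_matrix(fake_feats) - gram_matrix(real_feats)).mean()

# Usage: compare features of a generated 32-frame clip against a real clip.
fake = torch.randn(2, 64, 32, 16, 16)   # (batch, channels, frames, H, W)
real = torch.randn(2, 64, 32, 16, 16)
loss = gram_motion_loss(fake, real)
print(loss.item())
```

Because the Gram matrix discards spatial and temporal position while keeping channel co-activation statistics, a loss of this kind emphasizes the overall style of the motion rather than pixel-level alignment.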