Unified Arbitrary-Time Video Frame Interpolation and Prediction
Video frame interpolation and prediction aim to synthesize frames in-between and subsequent to existing frames, respectively. Despite being closely related, these two tasks are traditionally studied with different model architectures, or with the same architecture but individually trained weights. Furthermore, while arbitrary-time interpolation has been extensively studied, the value of arbitrary-time prediction has been largely overlooked. In this work, we present uniVIP - unified arbitrary-time Video Interpolation and Prediction. Technically, we first extend an interpolation-only network to arbitrary-time interpolation and prediction, with a special input channel for task (interpolation or prediction) encoding. Then, we show how to train a unified model on common triplet frames. Our uniVIP provides competitive results for video interpolation, and outperforms existing state-of-the-art methods for video prediction. Code will be available at: this https URL
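To make the "special input channel for task encoding" concrete, below is a minimal sketch (not the authors' code) of how two reference frames could be stacked with an extra channel that encodes both the task and the target time. The function name, the sign convention (positive time values for interpolation, negative for prediction), and the channel layout are all illustrative assumptions, not details from the paper.

```python
import numpy as np

def build_input(frame0, frame1, t, task):
    """Stack two RGB frames plus one extra channel encoding task and time.

    frame0, frame1 : (H, W, 3) float arrays, the two reference frames.
    t              : target time (e.g., in [0, 1] between the frames).
    task           : 'interp' or 'predict' (hypothetical convention).
    """
    h, w, _ = frame0.shape
    # Assumed convention: +t encodes interpolation, -t encodes prediction.
    code = t if task == "interp" else -t
    task_channel = np.full((h, w, 1), code, dtype=frame0.dtype)
    # Result: 3 + 3 RGB channels plus 1 task/time channel = 7 channels.
    return np.concatenate([frame0, frame1, task_channel], axis=-1)

f0 = np.zeros((4, 4, 3), dtype=np.float32)
f1 = np.ones((4, 4, 3), dtype=np.float32)
x = build_input(f0, f1, 0.5, "interp")
# x has shape (4, 4, 7); the last channel holds the task/time code.
```

A single network consuming such an input can, in principle, serve both tasks with one set of weights, since the extra channel disambiguates what is being asked of it.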
@article{jin2025_2503.02316,
  title={Unified Arbitrary-Time Video Frame Interpolation and Prediction},
  author={Xin Jin and Longhai Wu and Jie Chen and Ilhyun Cho and Cheul-Hee Hahm},
  journal={arXiv preprint arXiv:2503.02316},
  year={2025}
}