ResearchTrend.AI
  • Communities
  • Connect sessions
  • AI calendar
  • Organizations
  • Join Slack
  • Contact Sales
Papers
Communities
Social Events
Terms and Conditions
Pricing
Contact Sales
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2026 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.21205
268
0
v1v2 (latest)

EF-VI: Enhancing End-Frame Injection for Video Inbetweening

27 May 2025
Liuhan Chen
Xiaodong Cun
Xiaoyu Li
Xianyi He
Shenghai Yuan
Jie Chen
Mingyu Ding
Lichao Sun
    VGen
ArXiv (abs)PDFHTMLHuggingFace (5 upvotes)
Main:7 Pages
12 Figures
Bibliography:3 Pages
10 Tables
Appendix:7 Pages
Abstract

Video inbetweening aims to synthesize intermediate video sequences conditioned on the given start and end frames. Current state-of-the-art methods primarily extend large-scale pre-trained Image-to-Video Diffusion Models (I2V-DMs) by incorporating the end-frame condition via direct fine-tuning or temporally bidirectional sampling. However, the former results in a weak end-frame constraint, while the latter inevitably disrupts the input representation of video frames, leading to suboptimal performance. To improve the end-frame constraint while avoiding disruption of the input representation, we propose a novel video inbetweening framework specific to recent and more powerful transformer-based I2V-DMs, termed EF-VI. It efficiently strengthens the end-frame constraint by utilizing an enhanced injection. This is based on our proposed well-designed lightweight module, termed EF-Net, which encodes only the end frame and expands it into temporally adaptive frame-wise features injected into the I2V-DM. Extensive experiments demonstrate the superiority of our EF-VI compared with other baselines.

View on arXiv
Comments on this paper