ResearchTrend.AI
arXiv:2311.04145

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

7 November 2023
Shiwei Zhang
Jiayu Wang
Yingya Zhang
Kang Zhao
Hangjie Yuan
Z. Qin
Xiang Wang
Deli Zhao
Jingren Zhou
    DiffM
    VGen

Papers citing "I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models"

50 / 155 papers shown
The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization
Jae-Won Chung
Jiachen Liu
Jeff J. Ma
Ruofan Wu
Oh Jun Kweon
Yuxuan Xia
Zhiyu Wu
Mosharaf Chowdhury
16
0
0
09 May 2025
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation
Teng Hu
Zhentao Yu
Zhengguang Zhou
Sen Liang
Yuan Zhou
Qin Lin
Qinglin Lu
DiffM
VGen
50
0
0
07 May 2025
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
Minkyu Choi
Sundar Sripada V. S.
Harsh Goel
Sahil Shah
Sandeep P. Chinchali
DiffM
VGen
79
0
0
24 Apr 2025
VideoMark: A Distortion-Free Robust Watermarking Framework for Video Diffusion Models
Xuming Hu
H. Li
J. Li
Aiwei Liu
WIGM
VGen
48
1
0
23 Apr 2025
ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance
Ying Li
Xiaobao Wei
Xiaowei Chi
Y. K. Li
Zhongyu Zhao
Hao Wang
Ningning MA
Ming Lu
Shanghang Zhang
VGen
36
0
0
23 Apr 2025
DriVerse: Navigation World Model for Driving Simulation via Multimodal Trajectory Prompting and Motion Alignment
X. Li
Chenming Wu
Zhao Yang
Zhihao Xu
Dingkang Liang
Y. Zhang
Ji Wan
J. Wang
VGen
67
1
0
22 Apr 2025
Visual Prompting for One-shot Controllable Video Editing without Inversion
Zhengbo Zhang
Yuxi Zhou
Duo Peng
Joo-Hwee Lim
Zhigang Tu
De Wen Soh
Lin Geng Foo
DiffM
37
1
0
19 Apr 2025
Physical Reservoir Computing in Hook-Shaped Rover Wheel Spokes for Real-Time Terrain Identification
Xiao Jin
Zihan Wang
Zhenhua Yu
Changrak Choi
Kalind Carpenter
T. Nanayakkara
23
0
0
17 Apr 2025
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
J. Xu
Y. Huang
Baoqi Pei
Junlin Hou
Qingqiu Li
Guo Chen
Y. Zhang
Rui Feng
Weidi Xie
DiffM
46
0
0
16 Apr 2025
Taming Consistency Distillation for Accelerated Human Image Animation
X. Wang
Shiwei Zhang
Hangjie Yuan
Yujie Wei
Y. Zhang
Changxin Gao
Yuehuan Wang
Nong Sang
VGen
22
0
0
15 Apr 2025
UniAnimate-DiT: Human Image Animation with Large-Scale Video Diffusion Transformer
X. Wang
Shiwei Zhang
Longxiang Tang
Y. Zhang
Changxin Gao
Yuehuan Wang
Nong Sang
VGen
16
0
0
15 Apr 2025
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
Ruineng Li
Daitao Xing
Huiming Sun
Yuanzhou Ha
Jinglin Shen
C. Ho
DiffM
VGen
37
0
0
11 Apr 2025
CamContextI2V: Context-aware Controllable Video Generation
Luis Denninger
Sina Mokhtarzadeh Azar
Juergen Gall
VGen
28
0
0
08 Apr 2025
Beyond Static Scenes: Camera-controllable Background Generation for Human Motion
Mingshuai Yao
Mengting Chen
Qinye Zhou
Y. Zhang
Ming-Yu Liu
...
Chen Ju
Shuai Xiao
Qingwen Liu
Jinsong Lan
Wangmeng Zuo
DiffM
VGen
26
1
0
01 Apr 2025
Exploring the Evolution of Physics Cognition in Video Generation: A Survey
Minghui Lin
Xiang Wang
Y. Wang
Shu Wang
Fengqi Dai
...
Cunxiang Wang
Zhengrong Zuo
Nong Sang
Siteng Huang
Donglin Wang
EGVM
VGen
75
3
0
27 Mar 2025
Multi-Object Sketch Animation by Scene Decomposition and Motion Planning
Jingyu Liu
Zijie Xin
Yuhan Fu
Ruixiang Zhao
Bangxiang Lan
Xirong Li
39
0
0
25 Mar 2025
TransAnimate: Taming Layer Diffusion to Generate RGBA Video
Xuewei Chen
Zhimin Chen
Yiren Song
VGen
61
0
0
23 Mar 2025
RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation
Zhiqiang Yuan
Ting Zhang
Ying Deng
Jiapei Zhang
Yeshuang Zhu
Zexi Jia
Jie Zhou
Jinchao Zhang
VGen
34
0
0
22 Mar 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang
Hao Zhou
Haoming Qin
Xiaobin Lu
Jiaxing Yan
Guanzhong Wang
Zeyu Chen
Yi Liu
DiffM
VGen
60
0
0
21 Mar 2025
Generating, Fast and Slow: Scalable Parallel Video Generation with Video Interface Networks
Bhishma Dedhia
David Bourgin
Krishna Kumar Singh
Yuheng Li
Yan Kang
Zhan Xu
N. Jha
Y. Liu
DiffM
VGen
72
0
0
21 Mar 2025
FiVE: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models
Minghan Li
C. Xie
Y. Wu
Lei Zhang
M. Wang
DiffM
VGen
50
0
0
17 Mar 2025
Long Context Tuning for Video Generation
Yuwei Guo
Ceyuan Yang
Ziyan Yang
Zhibei Ma
Zhijie Lin
Zhenheng Yang
Dahua Lin
Lu Jiang
DiffM
VGen
70
1
0
13 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Y. Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
62
0
0
13 Mar 2025
DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image
Qi Zhao
Zhan Ma
Pan Zhou
VGen
67
0
0
13 Mar 2025
NIL: No-data Imitation Learning by Leveraging Pre-trained Video Diffusion Models
Mert Albaba
Chenhao Li
Markos Diomataris
Omid Taheri
Andreas Krause
M. Black
VGen
53
0
0
13 Mar 2025
Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework
Jing Wang
Fengzhuo Zhang
Xiaoli Li
Vincent Y. F. Tan
Tianyu Pang
Chao Du
Aixin Sun
Zhuoran Yang
VGen
59
1
0
12 Mar 2025
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models
Kwan Yun
Seokhyeon Hong
Chaelin Kim
Junyong Noh
DiffM
VGen
43
0
0
11 Mar 2025
VACE: All-in-One Video Creation and Editing
Zeyinzi Jiang
Zhen Han
Chaojie Mao
J. Zhang
Yulin Pan
Yu Liu
DiffM
VGen
41
4
0
10 Mar 2025
Automated Movie Generation via Multi-Agent CoT Planning
Weijia Wu
Zeyu Zhu
Mike Zheng Shou
VGen
67
1
0
10 Mar 2025
DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation
Runze Zhang
Guoguang Du
Xiaochuan Li
Qi Jia
Liang Jin
...
Zhenhua Guo
Yaqian Zhao
Xiaoli Gong
Rengang Li
Baoyu Fan
VGen
70
0
0
08 Mar 2025
Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think
Jie Tian
Xiaoye Qu
Zhenyi Lu
Wei Wei
Sichen Liu
Yu-Xi Cheng
DiffM
VGen
41
0
0
02 Mar 2025
FaceShot: Bring Any Character into Life
Junyao Gao
Yanan Sun
Fei Shen
Xin Jiang
Zhening Xing
Kai-xiang Chen
Cairong Zhao
CVBM
3DH
37
1
0
02 Mar 2025
FlexiDiT: Your Diffusion Transformer Can Easily Generate High-Quality Samples with Less Compute
Sotiris Anagnostidis
Gregor Bachmann
Yeongmin Kim
Jonas Kohler
Markos Georgopoulos
A. Sanakoyeu
Yuming Du
Albert Pumarola
Ali K. Thabet
Edgar Schönfeld
76
0
0
27 Feb 2025
ASurvey: Spatiotemporal Consistency in Video Generation
Zhiyu Yin
Kehai Chen
Xuefeng Bai
Ruili Jiang
J. Li
Hongdong Li
Jin Liu
Yang Xiang
Jun Yu
Min Zhang
EGVM
VGen
AI4TS
49
0
0
25 Feb 2025
Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach
Yunuo Chen
Junli Cao
Anil Kag
Vidit Goel
Sergei Korolev
Chenfanfu Jiang
Sergey Tulyakov
Jian Ren
DiffM
VGen
86
1
0
05 Feb 2025
IPO: Iterative Preference Optimization for Text-to-Video Generation
Xiaomeng Yang
Zhiyu Tan
Xuecheng Nie
VGen
101
1
0
04 Feb 2025
VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
Runyi Hu
J. Zhang
Y. Li
Jiwei Li
Qing-Wu Guo
Han Qiu
Tianwei Zhang
WIGM
VGen
74
4
0
24 Jan 2025
Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning
Maomao Li
Lijian Lin
Yunfei Liu
Ye Zhu
Yu Li
DiffM
VGen
37
0
0
11 Jan 2025
MEt3R: Measuring Multi-View Consistency in Generated Images
Mohammad Asim
Christopher Wewer
Thomas Wimmer
Bernt Schiele
J. E. Lenssen
EGVM
3DGS
VGen
46
7
0
10 Jan 2025
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning
Yuzhou Huang
Ziyang Yuan
Quande Liu
Qiulin Wang
Xintao Wang
Ruimao Zhang
Pengfei Wan
Di Zhang
Kun Gai
VGen
DiffM
35
10
0
08 Jan 2025
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
Rui Xie
Yinhong Liu
Penghao Zhou
Chen Zhao
Jun Zhou
K. Zhang
Z. Zhang
Jian Yang
Z. Yang
Ying Tai
VGen
DiffM
36
1
0
06 Jan 2025
Vitron: A Unified Pixel-level Vision LLM for Understanding, Generating, Segmenting, Editing
Hao Fei
Shengqiong Wu
H. Zhang
Tat-Seng Chua
Shuicheng Yan
56
35
0
31 Dec 2024
ILDiff: Generate Transparent Animated Stickers by Implicit Layout Distillation
Ting Zhang
Zhiqiang Yuan
Yeshuang Zhu
Jinchao Zhang
DiffM
94
0
0
31 Dec 2024
GEM: A Generalizable Ego-Vision Multimodal World Model for Fine-Grained Ego-Motion, Object Dynamics, and Scene Composition Control
Mariam Hassan
Sebastian Stapf
Ahmad Rahimi
Pedro M B Rezende
Yasaman Haghighi
...
Mathieu Salzmann
Davide Scaramuzza
Marc Pollefeys
Paolo Favaro
Alexandre Alahi
VLM
VGen
64
4
0
15 Dec 2024
Dynamic Try-On: Taming Video Virtual Try-on with Dynamic Attention Mechanism
Jun Zheng
Jing Wang
Fuwei Zhao
Xujie Zhang
Xiaodan Liang
DiffM
VGen
73
0
0
13 Dec 2024
Owl-1: Omni World Model for Consistent Long Video Generation
Yuanhui Huang
Wenzhao Zheng
Yuan Gao
Xin Tao
Pengfei Wan
Di Zhang
Jie Zhou
Jiwen Lu
VGen
VLM
82
0
0
12 Dec 2024
InfinityDrive: Breaking Time Limits in Driving World Models
Xi Guo
C. Ding
Haoxuan Dou
Xin Zhang
Weixuan Tang
Wei Yu Wu
VGen
81
5
0
02 Dec 2024
Fleximo: Towards Flexible Text-to-Human Motion Video Generation
Yuhang Zhang
Yuan Zhou
Zeyu Liu
Yuxuan Cai
Qiuyue Wang
Aidong Men
Huan Yang
VGen
DiffM
69
0
0
29 Nov 2024
SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing
Rong-Cheng Tu
Wenhao Sun
Zhao Jin
Jingyi Liao
Jiaxing Huang
Dacheng Tao
VGen
DiffM
92
3
0
28 Nov 2024
Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
Long Chen
DiffM
VGen
90
0
0
25 Nov 2024