1812.01717
Towards Accurate Generative Models of Video: A New Metric & Challenges
3 December 2018
Thomas Unterthiner
Sjoerd van Steenkiste
Karol Kurach
Raphaël Marinier
Marcin Michalski
Sylvain Gelly
EGVM
VGen
Papers citing
"Towards Accurate Generative Models of Video: A New Metric & Challenges"
50 / 715 papers shown
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Ruineng Li
Daitao Xing
Huiming Sun
Yuanzhou Ha
Jinglin Shen
C. Ho
DiffM
VGen
289
5
0
11 Apr 2025
EasyGenNet: An Efficient Framework for Audio-Driven Gesture Video Generation Based on Diffusion Model
Renda Li
Xiaohua Qi
Q. Ling
Jun Yu
Ziyi Chen
Peng Chang
Mei Han
Jing Xiao
DiffM
VGen
275
0
0
11 Apr 2025
IGG: Image Generation Informed by Geodesic Dynamics in Deformation Spaces
Information Processing in Medical Imaging (IPMI), 2025
Nian Wu
Nivetha Jayakumar
Jiarui Xing
Miaomiao Zhang
352
1
0
09 Apr 2025
RAGME: Retrieval Augmented Video Generation for Enhanced Motion Realism
International Conference on Multimedia Retrieval (ICMR), 2025
E. Peruzzo
Dejia Xu
Xingqian Xu
Humphrey Shi
Andrii Zadaianchuk
DiffM
VGen
335
3
0
09 Apr 2025
DyDiT++: Diffusion Transformers with Timestep and Spatial Dynamics for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Xiaojiang Peng
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
593
3
0
09 Apr 2025
Video-Bench: Human-Aligned Video Generation Benchmark
Computer Vision and Pattern Recognition (CVPR), 2025
Hui Han
Siyuan Li
Jiaqi Chen
Yiwen Yuan
Yuling Wu
...
You Li
Jing Zhang
Chi Zhang
Li Li
Yongxin Ni
EGVM
VGen
587
16
0
07 Apr 2025
Can You Count to Nine? A Human Evaluation Benchmark for Counting Limits in Modern Text-to-Video Models
Xuyang Guo
Zekai Huang
Jiayan Huo
Yingyu Liang
Zhenmei Shi
Zhao Song
Jiahao Zhang
ALM
VGen
508
13
0
05 Apr 2025
Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation
Fa-Ting Hong
Zunnan Xu
Zixiang Zhou
Zhiqiang Zhang
Xiu Li
Qin Lin
Qinglin Lu
D. Xu
DiffM
VGen
510
10
0
03 Apr 2025
CoGen: 3D Consistent Video Generation via Adaptive Conditioning for Autonomous Driving
Yishen Ji
Ziyue Zhu
Zhenxin Zhu
Kaixin Xiong
Jiaying Ying
Zhiqi Li
Lijun Zhou
Haiyang Sun
Bing Wang
Tong Lu
VGen
286
5
0
28 Mar 2025
Exploring the Evolution of Physics Cognition in Video Generation: A Survey
Minghui Lin
Xiang Wang
Longji Xu
Shu Wang
Fengqi Dai
...
Cunxiang Wang
Zhengrong Zuo
Nong Sang
Siteng Huang
Donglin Wang
EGVM
VGen
377
23
0
27 Mar 2025
DynamiCtrl: Rethinking the Basic Structure and the Role of Text for High-quality Human Image Animation
Haoyu Zhao
Zhongang Qi
Cong Wang
Qingping Zheng
Guansong Lu
Fei Chen
Hang Xu
Zuxuan Wu
DiffM
VGen
299
2
0
27 Mar 2025
AudCast: Audio-Driven Human Video Generation by Cascaded Diffusion Transformers
Computer Vision and Pattern Recognition (CVPR), 2025
Jiazhi Guan
Kaisiyuan Wang
Zhiliang Xu
Quanwei Yang
Yasheng Sun
...
Errui Ding
Jiadong Wang
Youjian Zhao
Hang Zhou
Ziwei Liu
VGen
288
3
0
25 Mar 2025
MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation
Computer Vision and Pattern Recognition (CVPR), 2025
Yukang Lin
Hokit Fung
Jianjin Xu
Zeping Ren
Adela S.M. Lau
Guosheng Yin
Xiu Li
VGen
304
12
0
25 Mar 2025
AdaWorld: Learning Adaptable World Models with Latent Actions
Shenyuan Gao
Siyuan Zhou
Yilun Du
Jun Zhang
Chuang Gan
VGen
574
36
0
24 Mar 2025
EvAnimate: Event-conditioned Image-to-Video Generation for Human Animation
Qiang Qu
Ming Li
Xiaoming Chen
Tongliang Liu
DiffM
VGen
340
3
0
24 Mar 2025
Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Dingcheng Zhen
Shunshun Yin
Shiyang Qin
Hou Yi
Ziwei Zhang
Siyuan Liu
Gan Qi
Ming Tao
VGen
261
10
0
24 Mar 2025
RDTF: Resource-efficient Dual-mask Training Framework for Multi-frame Animated Sticker Generation
Zhiqiang Yuan
Ting Zhang
Ying Deng
Jiapei Zhang
Yeshuang Zhu
Zexi Jia
Jie Zhou
Jinchao Zhang
VGen
243
2
0
22 Mar 2025
Enabling Versatile Controls for Video Diffusion Models
Xu Zhang
Hao Zhou
Haoming Qin
Xiaobin Lu
Jiaxing Yan
Guanzhong Wang
Zeyu Chen
Yi Liu
DiffM
VGen
272
4
0
21 Mar 2025
PoseTraj: Pose-Aware Trajectory Control in Video Diffusion
Computer Vision and Pattern Recognition (CVPR), 2025
Longbin Ji
Lei Zhong
Pengfei Wei
Changjian Li
DiffM
VGen
270
3
0
20 Mar 2025
MiLA: Multi-view Intensive-fidelity Long-term Video Generation World Model for Autonomous Driving
Haiguang Wang
Daqi Liu
Hongwei Xie
Haisong Liu
Enhui Ma
Kaicheng Yu
Limin Wang
Bing Wang
VGen
305
5
0
20 Mar 2025
SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation
Chun-Han Yao
Yiming Xie
Vikram S. Voleti
Huaizu Jiang
Varun Jampani
3DGS
VGen
543
27
0
20 Mar 2025
Ultrasound Image-to-Video Synthesis via Latent Dynamic Diffusion Models
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Tingxiu Chen
Yilei Shi
Zixuan Zheng
Bingcong Yan
Jingliang Hu
Xiao Xiang Zhu
Lichao Mou
VGen
MedIm
313
15
0
19 Mar 2025
Temporal Regularization Makes Your Video Generator Stronger
Harold Haodong Chen
Haojian Huang
Xianfeng Wu
Yexin Liu
Yajing Bai
Wen-Jie Shu
Harry Yang
Ser-Nam Lim
VGen
379
8
0
19 Mar 2025
Fast Autoregressive Video Generation with Diagonal Decoding
Yang Ye
Junliang Guo
Haoyu Wu
Tianyu He
Tim Pearce
Tabish Rashid
Katja Hofmann
Li Zhao
DiffM
VGen
260
4
0
18 Mar 2025
Generative Gaussian Splatting: Generating 3D Scenes with Video Diffusion Priors
Katja Schwarz
Norman Mueller
Peter Kontschieder
3DGS
301
11
0
17 Mar 2025
AUTV: Creating Underwater Video Datasets with Pixel-wise Annotations
Quang-Trung Truong
Wong Yuk Kwan
Duc Thanh Nguyen
Binh-Son Hua
Sai-Kit Yeung
VGen
318
1
0
17 Mar 2025
EQ-TAA: Equivariant Traffic Accident Anticipation via Diffusion-Based Accident Video Synthesis
Jianwu Fang
Lei-lei Li
Zhedong Zheng
Hongkai Yu
Jianru Xue
Zhengguo Li
Tat-Seng Chua
244
0
0
16 Mar 2025
TACO: Taming Diffusion for in-the-wild Video Amodal Completion
Ruijie Lu
Yixin Chen
Yu Liu
Jiaxiang Tang
Junfeng Ni
Diwen Wan
Gang Zeng
Siyuan Huang
DiffM
VGen
468
9
0
15 Mar 2025
RASA: Replace Anyone, Say Anything -- A Training-Free Framework for Audio-Driven and Universal Portrait Video Editing
Tianrui Pan
Lin Liu
Jie Liu
Xinsong Zhang
J. Tang
Gangshan Wu
Q. Tian
DiffM
VGen
300
0
0
14 Mar 2025
HiTVideo: Hierarchical Tokenizers for Enhancing Text-to-Video Generation with Autoregressive Large Language Models
Ziqin Zhou
Yifan Yang
Yue Yang
Tianyu He
Houwen Peng
Kai Qiu
Qi Dai
Lili Qiu
Chong Luo
Lingqiao Liu
DiffM
VGen
183
5
0
14 Mar 2025
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
Yasheng Sun
Zhiliang Xu
Hang Zhou
Jiazhi Guan
Quanwei Yang
...
Yingying Li
Haocheng Feng
Jiadong Wang
Ziwei Liu
Koike Hideki
VGen
352
3
0
13 Mar 2025
Inter-environmental world modeling for continuous and compositional dynamics
Kunihiko Miyoshi
Masanori Koyama
Julian Jorge Andrade Guerreiro
KELM
306
0
0
13 Mar 2025
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
Hao He
Ceyuan Yang
Shanchuan Lin
Yinghao Xu
Meng Wei
Liangke Gui
Qi Zhao
Gordon Wetzstein
Lu Jiang
Hongsheng Li
DiffM
VGen
391
42
0
13 Mar 2025
Neighboring Autoregressive Modeling for Efficient Visual Generation
Yefei He
Yuanyu He
Shaoxuan He
Feng Chen
Hong Zhou
Jianchao Tan
Bohan Zhuang
326
17
0
12 Mar 2025
Other Vehicle Trajectories Are Also Needed: A Driving World Model Unifies Ego-Other Vehicle Trajectories in Video Latent Space
Jian Zhu
Zhengyu Jia
Tian Gao
Jiaxin Deng
Shidi Li
Han Li
Fu Liu
Xianpeng Lang
Xiaolong Sun
VGen
975
4
0
12 Mar 2025
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
Hyeonho Jeong
Suhyeon Lee
Jong Chul Ye
VGen
1.2K
11
0
12 Mar 2025
PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop
Chenyu Li
Oscar Michel
Xichen Pan
Sainan Liu
Mike Roberts
Saining Xie
VGen
229
25
0
12 Mar 2025
TPDiff: Temporal Pyramid Video Diffusion Model
L. Ran
Mike Zheng Shou
284
1
0
12 Mar 2025
V2M4: 4D Mesh Animation Reconstruction from a Single Monocular Video
Jianqi Chen
Biao Zhang
Xiangjun Tang
Peter Wonka
VGen
304
15
0
11 Mar 2025
REGEN: Learning Compact Video Embedding with (Re-)Generative Decoder
Yitian Zhang
Long Mai
Aniruddha Mahapatra
Yitian Zhang
Yicong Hong
Jonah Casebeer
Feng Liu
Y. Fu
DiffM
VGen
248
0
0
11 Mar 2025
Toward Stable World Models: Measuring and Addressing World Instability in Generative Environments
Soonwoo Kwon
Jin-Young Kim
Hyojun Go
Kyungjune Baek
284
2
0
11 Mar 2025
Versatile Multimodal Controls for Expressive Talking Human Animation
Zheng Qin
Ruobing Zheng
Yabing Wang
Tianqi Li
Zixin Zhu
Minghui Yang
Ming Yang
Le Wang
DiffM
VGen
330
0
0
10 Mar 2025
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion
Computer Vision and Pattern Recognition (CVPR), 2025
Mingzhen Sun
Weining Wang
Gen Li
Jiawei Liu
Jiahui Sun
Wanquan Feng
Shanshan Lao
Siyu Zhou
Qian He
Qingbin Liu
DiffM
VGen
353
27
0
10 Mar 2025
LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation
Quanjian Song
Zhihang Lin
Zhanpeng Zeng
Ziyue Zhang
Liujuan Cao
Rongrong Ji
VGen
309
5
0
09 Mar 2025
VideoPhy-2: A Challenging Action-Centric Physical Commonsense Evaluation in Video Generation
Hritik Bansal
Clark Peng
Yonatan Bitton
Roman Goldenberg
Aditya Grover
Kai-Wei Chang
EGVM
VGen
305
42
0
09 Mar 2025
VACT: A Video Automatic Causal Testing System and a Benchmark
Haotong Yang
Qingyuan Zheng
Yunjian Gao
Yongkun Yang
Yangbo He
Zhouchen Lin
Muhan Zhang
VGen
CML
355
0
0
08 Mar 2025
Get In Video: Add Anything You Want to the Video
Shaobin Zhuang
Zhipeng Huang
Binxin Yang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Chong Sun
Zheng-Jun Zha
Chen Li
Yijiao Wang
DiffM
VGen
367
9
0
08 Mar 2025
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
Ziyi Yang
Fanqi Wan
Longguang Zhong
Canbin Huang
Guosheng Liang
Xiaojun Quan
MoMe
286
10
0
06 Mar 2025
DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance
Zhao Yang
Zezhong Qian
Xiaofan Li
Weixiang Xu
Gongpeng Zhao
Ruohong Yu
Lingsi Zhu
Longjun Liu
DiffM
VGen
368
4
0
05 Mar 2025
GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning
Zhun Mou
Bin Xia
Zhengchao Huang
Wenming Yang
Jiaya Jia
VGen
ELM
LRM
318
4
0
04 Mar 2025