Towards Accurate Generative Models of Video: A New Metric & Challenges

3 December 2018
Thomas Unterthiner
Sjoerd van Steenkiste
Karol Kurach
Raphaël Marinier
Marcin Michalski
Sylvain Gelly
EGVM, VGen

Papers citing "Towards Accurate Generative Models of Video: A New Metric & Challenges"

50 / 715 papers shown
DiffuseSlide: Training-Free High Frame Rate Video Generation Diffusion
Geunmin Hwang
Hyun-kyu Ko
Younghyun Kim
S. W. Lee
Eunbyung Park
VGen
226
1
0
02 Jun 2025
Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control
Xiao Fu
Xintao Wang
Xian Liu
Jianhong Bai
R. Xu
Pengfei Wan
Di Zhang
Dahua Lin
VGen
254
13
0
02 Jun 2025
Towards a Generalizable Bimanual Foundation Policy via Flow-based Video Prediction
Chenyou Fan
Fangzheng Yan
Fuchun Sun
Jiepeng Wang
Fangqiu Yi
Zhen Wang
Xuelong Li
VGen
1.1K
2
0
30 May 2025
DreamDance: Animating Character Art via Inpainting Stable Gaussian Worlds
Jiaxu Zhang
Xianfang Zeng
Xin Chen
W. Zuo
Gang Yu
Guosheng Lin
Zhigang Tu
DiffM, 3DGS, VGen
201
0
0
30 May 2025
Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization
Jiahao Cui
Yan Chen
Mingwang Xu
Hanlin Shang
Yuxuan Chen
Yun Zhan
Zilong Dong
Yao Yao
Jingdong Wang
Siyu Zhu
DiffM, VGen
539
8
0
29 May 2025
MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation
Siyuan Wang
Jiawei Liu
Wei Wang
Yeying Jin
Jinsong Du
Zhi Han
SLR, VGen
250
0
0
29 May 2025
Toward Memory-Aided World Models: Benchmarking via Spatial Consistency
Kewei Lian
Shaofei Cai
Yilun Du
Yitao Liang
235
1
0
29 May 2025
GeoDrive: 3D Geometry-Informed Driving World Model with Precise Action Control
Anthony Chen
Wenzhao Zheng
Yida Wang
Xueyang Zhang
Kun Zhan
Fu Liu
Kurt Keutzer
Shanghang Zhang
353
8
0
28 May 2025
PanoWan: Lifting Diffusion Video Generation Models to 360° with Latitude/Longitude-aware Mechanisms
Yifei Xia
Shuchen Weng
Siqi Yang
Jingqi Liu
Chengxuan Zhu
Minggui Teng
Zijian Jia
Han Jiang
Boxin Shi
DiffM, VGen
301
4
0
28 May 2025
Assessing the Use of Face Swapping Methods as Face Anonymizers in Videos
International Conference on Digital Signal Processing (DSP), 2025
Mustafa İzzet Muştu
Hazım Kemal Ekenel
PICV, CVBM
407
0
0
27 May 2025
Unified Text-Image-to-Video Generation: A Training-Free Approach to Flexible Visual Conditioning
Bolin Lai
Sangmin Lee
Xu Cao
Xiang Li
James M. Rehg
DiffM
274
0
0
27 May 2025
Frame In-N-Out: Unbounded Controllable Image-to-Video Generation
Boyang Wang
Xuweiyi Chen
Matheus Gadelha
Zezhou Cheng
DiffM, VGen
377
5
0
27 May 2025
AniCrafter: Customizing Realistic Human-Centric Animation via Avatar-Background Conditioning in Video Diffusion Models
Muyao Niu
Mingdeng Cao
Yifan Zhan
Qingtian Zhu
Mingze Ma
Jiancheng Zhao
Yanhong Zeng
Zhihang Zhong
Xiao Sun
Yinqiang Zheng
DiffM, VGen
322
6
0
26 May 2025
ProphetDWM: A Driving World Model for Rolling Out Future Actions and Videos
Xiaodong Wang
Peixi Peng
VGen
1.3K
1
0
24 May 2025
SafeMVDrive: Multi-view Safety-Critical Driving Video Synthesis in the Real World Domain
Jiawei Zhou
Linye Lyu
Zhuotao Tian
Cheng Zhuo
Yu Li
VGen
190
3
0
23 May 2025
Temporal Differential Fields for 4D Motion Modeling via Image-to-Video Synthesis
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Xin You
Minghui Zhang
Hanxiao Zhang
J. Yang
Nassir Navab
DiffM, VGen, MedIm
567
3
0
22 May 2025
Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction
Computer Vision and Pattern Recognition (CVPR), 2025
Dong Li
Wenqi Zhong
Wei Yu
Yingwei Pan
Xiaoxu Feng
Ting Yao
Junwei Han
Tao Mei
DiffM, VGen
256
4
0
22 May 2025
Consistent World Models via Foresight Diffusion
Yu Zhang
Xingzhuo Guo
Haoran Xu
Mingsheng Long
229
1
0
22 May 2025
Vid2World: Crafting Video Diffusion Models to Interactive World Models
Siqiao Huang
Jialong Wu
Qixing Zhou
Shangchen Miao
Mingsheng Long
VGen
331
12
0
20 May 2025
FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance
Computer Vision and Pattern Recognition (CVPR), 2025
Dian Shao
Mingfei Shi
Shengda Xu
Haodong Chen
Yongle Huang
Binglu Wang
3DH
372
6
0
19 May 2025
Safe-Sora: Safe Text-to-Video Generation via Graphical Watermarking
Zihan Su
Xuerui Qiu
Hongbin Xu
Tangyu Jiang
Junhao Zhuang
Chun Yuan
Ming Li
Shengfeng He
Fei Richard Yu
WIGM
400
1
0
19 May 2025
Building spatial world models from sparse transitional episodic memories
Zizhan He
Maxime Daigle
Pouya Bashivan
KELM
237
0
0
19 May 2025
Video-GPT via Next Clip Diffusion
Shaobin Zhuang
Zhipeng Huang
Ying Zhang
Fangyikang Wang
Canmiao Fu
Binxin Yang
Chong Sun
Chen Li
Yali Wang
DiffM, VGen
620
5
0
18 May 2025
LOVE: Benchmarking and Evaluating Text-to-Video Generation and Video-to-Text Interpretation
Jiarui Wang
Huiyu Duan
Ziheng Jia
Yu Zhao
Woo Yi Yang
...
Zhongfu Chen
Juntong Wang
Yuke Xing
Guangtao Zhai
Xiongkuo Min
VGen
452
5
0
17 May 2025
MTVCrafter: 4D Motion Tokenization for Open-World Human Image Animation
Yanbo Ding
Xirui Hu
Zhizhi Guo
Longji Xu
Yali Wang
DiffM, VGen
803
0
0
15 May 2025
FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation
Jun Guo
Xiaojian Ma
Yikai Wang
Min Yang
Huaping Liu
Qing Li
VGen
288
8
0
15 May 2025
Ophora: A Large-Scale Data-Driven Text-Guided Ophthalmic Surgical Video Generation Model
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Wei Li
Ming Hu
Guoan Wang
Lihao Liu
Kaijin Zhou
Junzhi Ning
Xin Guo
Zongyuan Ge
Lixu Gu
Junjun He
529
3
0
12 May 2025
ProFashion: Prototype-guided Fashion Video Generation with Multiple Reference Images
Xianghao Kong
Qiaosong Qi
Yuanbin Wang
Anyi Rao
Biaolong Chen
Aixi Zhang
Si Liu
Hao Jiang
DiffM, VGen
239
4
0
10 May 2025
A Unit Enhancement and Guidance Framework for Audio-Driven Avatar Video Generation
Y.B. Wang
S.Z. Zhou
J.F. Wu
T. Hu
J.N. Zhang
DiffM, VGen
577
0
0
06 May 2025
Learning 3D Persistent Embodied World Models
Siyuan Zhou
Yilun Du
Yuncong Yang
Lei Han
Peihao Chen
Dit-Yan Yeung
Chuang Gan
VGen
380
15
0
05 May 2025
Direct Motion Models for Assessing Generated Videos
Kelsey R. Allen
Carl Doersch
Guangyao Zhou
Mohammed Suhail
Danny Driess
...
Thomas Kipf
Mehdi S. M. Sajjadi
Kevin P. Murphy
João Carreira
Sjoerd van Steenkiste
EGVM, DiffM, VGen
487
5
0
30 Apr 2025
ReVision: Refining Video Diffusion with Explicit 3D Motion Modeling
Qihao Liu
Ju He
Qihang Yu
Liang-Chieh Chen
Alan Yuille
DiffM, VGen
510
5
0
30 Apr 2025
MagicPortrait: Temporally Consistent Face Reenactment with 3D Geometric Guidance
Mengting Wei
Yante Li
Tuomas Varanka
Yan Jiang
Guoying Zhao
DiffM, VGen
519
1
0
30 Apr 2025
DiVE: Efficient Multi-View Driving Scenes Generation Based on Video Diffusion Transformer
Junpeng Jiang
Gangyi Hong
Miao Zhang
Hengtong Hu
Kun Zhan
Rui Shao
Liqiang Nie
VGen
268
4
0
28 Apr 2025
IM-Portrait: Learning 3D-aware Video Diffusion for Photorealistic Talking Heads from Monocular Videos
Computer Vision and Pattern Recognition (CVPR), 2025
Yuan Li
Ziqian Bai
Feitong Tan
Zhaopeng Cui
S. Fanello
Yinda Zhang
DiffM, VGen
313
1
0
27 Apr 2025
NoiseController: Towards Consistent Multi-view Video Generation via Noise Decomposition and Collaboration
Haotian Dong
Xinze Wang
Dahua Lin
Yipeng Wu
Qin Chen
R. Liu
Kairui Yang
Ping Li
Qing Guo
VGen
247
1
0
25 Apr 2025
ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance
Ying Li
Xiaobao Wei
Yatian Wang
Yiming Li
Zhongyu Zhao
Hao Wang
Ningning Ma
Ming Lu
Shanghang Zhang
VGen
326
7
0
23 Apr 2025
Solving New Tasks by Adapting Internet Video Knowledge
International Conference on Learning Representations (ICLR), 2025
Calvin Luo
Zilai Zeng
Yilun Du
Chen Sun
235
12
0
21 Apr 2025
FlowLoss: Dynamic Flow-Conditioned Loss Strategy for Video Diffusion Models
Kuanting Wu
Kei Ota
Asako Kanezaki
DiffM, VGen
346
0
0
20 Apr 2025
EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos
International Conference on Learning Representations (ICLR), 2025
Jinfeng Xu
Yuanmin Huang
Baoqi Pei
Junlin Hou
Qingqiu Li
Guo Chen
Yuhui Zhang
Rui Feng
Weidi Xie
DiffM
277
16
0
16 Apr 2025
VideoPanda: Video Panoramic Diffusion with Multi-view Attention
Kevin Xie
Amirmojtaba Sabour
Jiahui Huang
Despoina Paschalidou
G. Klár
Umar Iqbal
Sanja Fidler
Fangyin Wei
VGen, MDE
407
4
0
15 Apr 2025
Taming Consistency Distillation for Accelerated Human Image Animation
Xinyu Wang
Shiwei Zhang
Hangjie Yuan
Yujie Wei
Yuanxing Zhang
Changxin Gao
Yuehuan Wang
Nong Sang
VGen
332
1
0
15 Apr 2025
InterAnimate: Taming Region-aware Diffusion Model for Realistic Human Interaction Animation
Yukang Lin
Y. Hong
Zunnan Xu
Xiaochen Li
Chao Xu
...
Jun Lan
Huijia Zhu
Weiqiang Wang
Jianfu Zhang
Xiu Li
VGen
318
1
0
15 Apr 2025
Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting
Jiaxin Huang
Sheng Miao
Bangbang Yang
Yuewen Ma
Yiyi Liao
VGen, MDE
616
2
0
15 Apr 2025
On Equivariance and Fast Sampling in Video Diffusion Models Trained with Warped Noise
Chao Liu
Arash Vahdat
DiffM, VGen
384
4
0
14 Apr 2025
H-MoRe: Learning Human-centric Motion Representation for Action Analysis
Computer Vision and Pattern Recognition (CVPR), 2025
Zhanbo Huang
Xiaoming Liu
Yu Kong
3DH
285
4
0
14 Apr 2025
Aligning Anime Video Generation with Human Feedback
Bingwen Zhu
Yudong Jiang
Baohan Xu
Siqian Yang
Mingyu Yin
Yidi Wu
Huyang Sun
Zuxuan Wu
EGVM, VGen
387
4
0
14 Apr 2025
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
Xingrui Wang
Jiang-Long Liu
Liang Luo
Xiaodong Yu
Jialian Wu
Xingwu Sun
Yusheng Su
Yaoyao Liu
Zicheng Liu
Emad Barsoum
DiffM, VGen
281
4
0
13 Apr 2025
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft
Junliang Guo
Yang Ye
Tianyu He
Haoyu Wu
Yushu Jiang
Tim Pearce
Li Zhao
VGen, SyDa
316
37
0
11 Apr 2025
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation
Computer Vision and Pattern Recognition (CVPR), 2025
Ruineng Li
Daitao Xing
Huiming Sun
Yuanzhou Ha
Jinglin Shen
C. Ho
DiffM, VGen
287
4
0
11 Apr 2025