Towards Accurate Generative Models of Video: A New Metric & Challenges

3 December 2018
Thomas Unterthiner
Sjoerd van Steenkiste
Karol Kurach
Raphaël Marinier
Marcin Michalski
Sylvain Gelly
EGVM, VGen

Papers citing "Towards Accurate Generative Models of Video: A New Metric & Challenges"

50 / 715 papers shown
LiDAR-BIND-T: Improved and Temporally Consistent Sensor Modality Translation and Fusion for Robotic Applications
Niels Balemans
Ali Anwar
Jan Steckel
Siegfried Mercelis
293
0
0
06 Sep 2025
Human Motion Video Generation: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Haiwei Xue
Xiangyang Luo
Zhanghao Hu
Shu Zhang
Xunzhi Xiang
...
Fei Ma
Zhiyong Wu
Changpeng Yang
Zonghong Dai
Fei Richard Yu
EGVM, VGen
233
24
0
04 Sep 2025
O-DisCo-Edit: Object Distortion Control for Unified Realistic Video Editing
Yihao Chen
Junjie Wang
Lin Liu
Ruihang Chu
Xiaopeng Zhang
Qi Tian
Yujiu Yang
DiffM, VGen
141
3
0
01 Sep 2025
Look Beyond: Two-Stage Scene View Generation via Panorama and Video Diffusion
Xueyang Kang
Zhengkang Xiang
Zezheng Zhang
Kourosh Khoshelham
DiffM, VGen
127
0
0
31 Aug 2025
Attention of a Kiss: Exploring Attention Maps in Video Diffusion for XAIxArts
Adam Cole
Mick Grierson
VGen
165
0
0
30 Aug 2025
ManipDreamer3D : Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory
Ying Li
Xiaobao Wei
Yatian Wang
Yuming Li
Zhongyu Zhao
Hao Wang
Ningning MA
Ming Lu
Shanghang Zhang
VGen
351
9
0
29 Aug 2025
InfinityHuman: Towards Long-Term Audio-Driven Human
X. Li
Pan Xie
Yi Ren
Qijun Gan
Chen Zhang
Fangyuan Kong
Xiang Yin
Bingyue Peng
Zehuan Yuan
VGen
134
4
0
27 Aug 2025
Lightning Fast Caching-based Parallel Denoising Prediction for Accelerating Talking Head Generation
Jianzhi Long
Wenhao Sun
Rongcheng Tu
Dacheng Tao
DiffM, VGen
173
0
0
25 Aug 2025
GWM: Towards Scalable Gaussian World Models for Robotic Manipulation
Guanxing Lu
Baoxiong Jia
Puhao Li
Yixin Chen
Ziwei Wang
Yansong Tang
Siyuan Huang
3DGS
203
8
0
25 Aug 2025
Seeing Clearly, Forgetting Deeply: Revisiting Fine-Tuned Video Generators for Driving Simulation
Chun-Peng Chang
Chen-Yu Wang
Julian Schmidt
Holger Caesar
A. Pagani
VGen
242
1
0
22 Aug 2025
CineScale: Free Lunch in High-Resolution Cinematic Visual Generation
Haonan Qiu
Ning Yu
Ziqi Huang
P. Debevec
Ziwei Liu
VGen
166
3
0
21 Aug 2025
Diverse Signer Avatars with Manual and Non-Manual Feature Modelling for Sign Language Production
Mohamed Ilyes Lakhal
Richard Bowden
DiffM
151
0
0
21 Aug 2025
MoVieDrive: Multi-Modal Multi-View Urban Scene Video Generation
Guile Wu
David Huang
Dongfeng Bai
Bingbing Liu
VGen
131
0
0
20 Aug 2025
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
Shunian Chen
Hejin Huang
Yexin Liu
Zihan Ye
Kai Chen
...
Junying Chen
Guanbin Li
Ser-Nam Lim
Harry Yang
Benyou Wang
EGVM, VGen
120
2
0
19 Aug 2025
EgoTwin: Dreaming Body and View in First Person
Jingqiao Xiu
Fangzhou Hong
Yicong Li
Mengze Li
Wentao Wang
Sirui Han
Liang Pan
Ziwei Liu
DiffM, VGen
154
4
0
18 Aug 2025
Versatile Video Tokenization with Generative 2D Gaussian Splatting
Zhenghao Chen
Zicong Chen
Lei Liu
Yiming Wu
Dong Xu
3DGS
136
0
0
15 Aug 2025
Animate-X++: Universal Character Image Animation with Dynamic Backgrounds
Shuai Tan
Biao Gong
Zhuoxin Liu
Yan Wang
Xi Chen
Yifan Feng
Hengshuang Zhao
VGen
260
2
0
13 Aug 2025
Preview WB-DH: Towards Whole Body Digital Human Bench for the Generation of Whole-body Talking Avatar Videos
Chaoyi Wang
Yifan Yang
Jun Pei
Lijie Xia
Jianpo Liu
Xiaobing Yuan
Xinhan Di
VGen
92
0
0
12 Aug 2025
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
Fangyuan Mao
Aiming Hao
Jintao Chen
Dongxia Liu
Xiaokun Feng
Jiashu Zhu
Meiqi Wu
Chubin Chen
Jiahong Wu
Xiangxiang Chu
DiffM, VGen
340
9
0
11 Aug 2025
StableAvatar: Infinite-Length Audio-Driven Avatar Video Generation
S. Tu
Yueming Pan
Y. Huang
Xintong Han
Zhen Xing
Jingdong Sun
Chong Luo
Zuxuan Wu
Yu-Gang Jiang
VGen
164
15
0
11 Aug 2025
ShoulderShot: Generating Over-the-Shoulder Dialogue Videos
Yuang Zhang
Junqi Cheng
Haoyu Zhao
Jiaxi Gu
Fangyuan Zou
Zenghui Lu
Peng Shu
VGen
221
2
0
11 Aug 2025
PoseGen: In-Context LoRA Finetuning for Pose-Controllable Long Human Video Generation
Jingxuan He
Busheng Su
Finn Wong
DiffM, VGen
118
2
0
07 Aug 2025
LayerT2V: Interactive Multi-Object Trajectory Layering for Video Generation
Kangrui Cen
Baixuan Zhao
Yi Xin
Siqi Luo
Guoquan Zheng
Xiaohong Liu
DiffM, VGen
144
0
0
06 Aug 2025
Scaling Up Audio-Synchronized Visual Animation: An Efficient Training Paradigm
Lin Zhang
Zefan Cai
Jiuxiang Gu
Shentong Mo
Jinhong Lin
...
Ruiyi Zhang
Wen Xiao
Tong Sun
Junjie Hu
Pedro Morgado
VGen
168
1
0
05 Aug 2025
QuaDreamer: Controllable Panoramic Video Generation for Quadruped Robots
Sheng Wu
Fei Teng
Hao Shi
Qi Jiang
Kai Luo
Kaiwei Wang
Kailun Yang
VGen
251
1
0
04 Aug 2025
PoseGuard: Pose-Guided Generation with Safety Guardrails
Kongxin Wang
Jie Zhang
Peigui Qi
Kunsheng Tang
Tianwei Zhang
Wenbo Zhou
81
0
0
04 Aug 2025
Text2Lip: Progressive Lip-Synced Talking Face Generation from Text via Viseme-Guided Rendering
Xu Wang
Shengeng Tang
Fei Wang
L. T. Cheng
Dan Guo
Feng Xue
Richang Hong
117
2
0
04 Aug 2025
TITAN-Guide: Taming Inference-Time AligNment for Guided Text-to-Video Diffusion Models
Christian Simon
Masato Ishii
Akio Hayakawa
Zhi-Wei Zhong
Shusuke Takahashi
Takashi Shibuya
Yuki Mitsufuji
130
1
0
01 Aug 2025
SpA2V: Harnessing Spatial Auditory Cues for Audio-driven Spatially-aware Video Generation
K. T. Pham
Yingqing He
Yazhou Xing
Qifeng Chen
L. Chen
DiffM, VGen
1.1K
1
0
01 Aug 2025
Compositional Video Synthesis by Temporal Object-Centric Learning
Adil Kaan Akan
Yucel Yemez
DiffM, OCL
234
0
0
28 Jul 2025
JWB-DH-V1: Benchmark for Joint Whole-Body Talking Avatar and Speech Generation Version 1
Xinhan Di
Kristin Qi
Pengqian Yu
DiffM, VGen
215
0
0
28 Jul 2025
ChoreoMuse: Robust Music-to-Dance Video Generation with Style Transfer and Beat-Adherent Motion
Xuanchen Wang
Heng Wang
Weidong (Tom) Cai
225
3
0
26 Jul 2025
MVG4D: Image Matrix-Based Multi-View and Motion Generation for 4D Content Creation from a Single Image
Xiaotian Chen
DongFu Yin
Fei Richard Yu
Xuanchen Li
Xinhao Zhang
3DGS
271
0
0
24 Jul 2025
AirScape: An Aerial Generative World Model with Motion Controllability
Baining Zhao
Rongze Tang
Mingyuan Jia
Ziyou Wang
Fanghang Man
...
W. Zhang
Wei Wu
Chen Gao
Xinlei Chen
Yong Li
VGen
173
3
0
10 Jul 2025
EchoMimicV3: 1.3B Parameters are All You Need for Unified Multi-Modal and Multi-Task Human Animation
Rang Meng
Y. Wang
Weipeng Wu
Ruobing Zheng
Yuming Li
Chenguang Ma
VGen, 3DH
284
14
0
05 Jul 2025
HumanGif: Single-View Human Diffusion with Generative Prior
Shoukang Hu
Takuya Narihira
Kazumi Fukuda
Ryosuke Sawata
Takashi Shibuya
Yuki Mitsufuji
526
5
0
01 Jul 2025
LatentMove: Towards Complex Human Movement Video Generation
Ashkan Taghipour
Morteza Ghahremani
Mohammed Bennamoun
F. Boussaïd
Aref Miri Rekavandi
Zinuo Li
Qiuhong Ke
Hamid Laga
3DH, VGen
275
1
0
01 Jul 2025
Adapting Vision-Language Models for Evaluating World Models
Mariya Hendriksen
Tabish Rashid
David Bignell
Raluca Georgescu
Abdelhak Lemkhenter
Katja Hofmann
Sam Devlin
Sarah Parisot
188
0
0
22 Jun 2025
Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization
Cong Wang
Zexuan Deng
Zhiwei Jiang
Fei Shen
Yafeng Yin
Shiwei Gan
Zifeng Cheng
Shiwei Gan
Qing Gu
DiffM, SLR, VGen
398
3
0
19 Jun 2025
STAGE: A Stream-Centric Generative World Model for Long-Horizon Driving-Scene Simulation
Jiamin Wang
Yichen Yao
Xiang Feng
Hang Wu
Yaming Wang
Qingqiu Huang
Y. Ma
Xinge Zhu
VGen
317
3
0
16 Jun 2025
Foundation Models in Autonomous Driving: A Survey on Scenario Generation and Scenario Analysis
Yuan Gao
Mattia Piccinini
Yuchen Zhang
Dingrui Wang
Korbinian Moller
...
Steven Peters
Andrea Stocco
Bassam Alrifaee
Marco Pavone
Johannes Betz
347
19
0
13 Jun 2025
Rethinking Generative Human Video Coding with Implicit Motion Transformation
Bolin Chen
Ru-Ling Liao
Jie Chen
Yan Ye
DiffM, VGen
256
2
0
12 Jun 2025
HunyuanVideo-HOMA: Generic Human-Object Interaction in Multimodal Driven Human Animation
Ziyao Huang
Zixiang Zhou
Juan Cao
Yifeng Ma
Yi Chen
...
Hongmei Wang
Qin Lin
Yuan Zhou
Qinglin Lu
Fan Tang
VGen
221
4
0
10 Jun 2025
From Pixels to Graphs: using Scene and Knowledge Graphs for HD-EPIC VQA Challenge
Agnese Taluzzi
Davide Gesualdi
Riccardo Santambrogio
Chiara Plizzari
Francesca Palermo
S. Mentasti
Matteo Matteucci
GNN
290
2
0
10 Jun 2025
EgoM2P: Egocentric Multimodal Multitask Pretraining
Gen Li
Yutong Chen
Yiqian Wu
Kaifeng Zhao
Marc Pollefeys
Siyu Tang
EgoV, VLM
408
4
0
09 Jun 2025
Audio-Sync Video Generation with Multi-Stream Temporal Control
Shuchen Weng
Haojie Zheng
Zheng Chang
Si Li
Boxin Shi
Xinlong Wang
DiffM, VGen
205
4
0
09 Jun 2025
FEAT: Full-Dimensional Efficient Attention Transformer for Medical Video Generation
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Huihan Wang
Zhiwen Yang
Hui Zhang
Dan Zhao
Bingzheng Wei
Yan Xu
MedIm, ViT
254
0
0
05 Jun 2025
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models
Ziyi Wu
Vidit Goel
Ivan Skorokhodov
Willi Menapace
Ashkan Mirzaei
Igor Gilitschenski
Sergey Tulyakov
Aliaksandr Siarohin
DiffM, VGen
389
11
0
04 Jun 2025
SG2VID: Scene Graphs Enable Fine-Grained Control for Video Synthesis
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025
Ssharvien Kumar Sivakumar
Yannik Frisch
Ghazal Ghazaei
Anirban Mukhopadhyay
VGen
292
1
0
03 Jun 2025
CamCloneMaster: Enabling Reference-based Camera Control for Video Generation
Yawen Luo
J. Bai
Xiaoyu Shi
Menghan Xia
Xintao Wang
Pengfei Wan
Di Zhang
Kun Gai
Tianfan Xue
DiffM, VGen
205
10
0
03 Jun 2025