arXiv: 2404.05014
MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
7 April 2024
Shenghai Yuan
Jinfa Huang
Yujun Shi
Yongqi Xu
Ruijie Zhu
Bin Lin
Xinhua Cheng
Li Yuan
Jiebo Luo
VGen
Papers citing "MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators"
32 / 32 papers shown
Title
CountDiffusion: Text-to-Image Synthesis with Training-Free Counting-Guidance Diffusion
Y. Li
Pencheng Wan
Liang Han
Yaowei Wang
Liqiang Nie
Min Zhang
31
0
0
07 May 2025
SkyReels-A2: Compose Anything in Video Diffusion Transformers
Zhengcong Fei
D. Li
Di Qiu
J. Wang
Yikun Dou
...
J. Xu
Mingyuan Fan
Guibin Chen
Yang Li
Yahui Zhou
DiffM
VGen
63
2
0
03 Apr 2025
HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation
Kun Liu
Qi Liu
Xinchen Liu
Jie Li
Yongdong Zhang
Jiebo Luo
Xiaodong He
Wu Liu
VGen
33
0
0
31 Mar 2025
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation
Qihui Zhang
Munan Ning
Zheyuan Liu
Yanbo Wang
Jiayi Ye
Yue Huang
Shuo Yang
Xiao Chen
Y. Song
Li Yuan
LRM
51
0
0
19 Mar 2025
MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation
Hongyu Zhang
Yufan Deng
Shenghai Yuan
Peng Jin
Zesen Cheng
Yian Zhao
Chang-Shu Liu
Jie Chen
DiffM
VGen
89
0
0
18 Mar 2025
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
Yufan Deng
Xun Guo
Y. Wang
Jacob Zhiyuan Fang
Angtian Wang
Shenghai Yuan
Yiding Yang
Bo Liu
Haibin Huang
Chongyang Ma
DiffM
VGen
62
0
0
13 Mar 2025
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses
Yatian Pang
Bin Zhu
Bin Lin
Mingzhe Zheng
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
VGen
3DH
63
2
0
30 Nov 2024
WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model
Zongjian Li
Bin Lin
Yang Ye
Liuhan Chen
Xinhua Cheng
Shenghai Yuan
Li Yuan
VGen
DiffM
104
16
0
26 Nov 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
114
1
0
25 Nov 2024
ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models
Vipula Rawte
Sarthak Jain
Aarush Sinha
Garv Kaushik
Aman Bansal
...
Aishwarya N. Reganti
Vinija Jain
Aman Chadha
A. Sheth
A. Das
VLM
MLLM
32
1
0
16 Nov 2024
Autoregressive Models in Vision: A Survey
Jing Xiong
Gongye Liu
Lun Huang
Chengyue Wu
Taiqiang Wu
...
M. Zhang
Guillermo Sapiro
Jiebo Luo
Ping Luo
Ngai Wong
VGen
38
7
0
08 Nov 2024
Hallo2: Long-Duration and High-Resolution Audio-Driven Portrait Image Animation
Jiahao Cui
Hui Li
Yao Yao
Hao Zhu
Hanlin Shang
Kaihui Cheng
Hang Zhou
Siyu Zhu
Jingdong Wang
DiffM
VGen
19
20
0
10 Oct 2024
CASA: Class-Agnostic Shared Attributes in Vision-Language Models for Efficient Incremental Object Detection
Mingyi Guo
Yuyang Liu
Zongying Lin
Peixi Peng
Yonghong Tian
VLM
27
0
0
08 Oct 2024
SePPO: Semi-Policy Preference Optimization for Diffusion Alignment
Daoan Zhang
Guangchen Lan
Dong-Jun Han
Wenlin Yao
Xiaoman Pan
...
Mingxiao Li
Pengcheng Chen
Yu Dong
Christopher Brinton
Jiebo Luo
EGVM
20
4
0
07 Oct 2024
BitQ: Tailoring Block Floating Point Precision for Improved DNN Efficiency on Resource-Constrained Devices
Yongqi Xu
Yujian Lee
Gao Yi
Bosheng Liu
Yucong Chen
Peng Liu
Jigang Wu
Xiaoming Chen
Yinhe Han
MQ
21
0
0
25 Sep 2024
OD-VAE: An Omni-dimensional Video Compressor for Improving Latent Video Diffusion Model
Liuhan Chen
Zongjian Li
Bin Lin
Bin Zhu
Qian Wang
Shenghai Yuan
X. Zhou
Xinhua Cheng
Li Yuan
DiffM
83
14
0
02 Sep 2024
Towards Understanding Unsafe Video Generation
Yan Pang
Aiping Xiong
Yang Zhang
Tianhao Wang
EGVM
19
2
0
17 Jul 2024
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Kepan Nan
Rui Xie
Penghao Zhou
Tiehan Fan
Zhenheng Yang
Zhijie Chen
Xiang Li
Jian Yang
Ying Tai
67
68
0
02 Jul 2024
ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Shenghai Yuan
Jinfa Huang
Yongqi Xu
Yaoyang Liu
Shaofeng Zhang
Yujun Shi
Ruijie Zhu
Xinhua Cheng
Jiebo Luo
Li Yuan
EGVM
VGen
66
1
0
26 Jun 2024
MVOC: a training-free multiple video object composition method with diffusion models
Wei Wang
Yaosen Chen
Yuegen Liu
Qi Yuan
Shubin Yang
Yanru Zhang
DiffM
60
2
0
22 Jun 2024
A Large-scale Universal Evaluation Benchmark For Face Forgery Detection
Yijun Bei
Hengrui Lou
Jinsong Geng
Erteng Liu
Lechao Cheng
Jie Song
Mingli Song
Zunlei Feng
CVBM
26
0
0
13 Jun 2024
Real-world Image Dehazing with Coherence-based Label Generator and Cooperative Unfolding Network
Chengyu Fang
Chunming He
Fengyang Xiao
Yulun Zhang
Longxiang Tang
Yuelin Zhang
Kai Li
Xiu Li
32
9
0
12 Jun 2024
Lumiere: A Space-Time Diffusion Model for Video Generation
Omer Bar-Tal
Hila Chefer
Omer Tov
Charles Herrmann
Roni Paiss
...
T. Michaeli
Oliver Wang
Deqing Sun
Tali Dekel
Inbar Mosseri
VGen
101
214
0
23 Jan 2024
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Haoxin Chen
Yong Zhang
Xiaodong Cun
Menghan Xia
Xintao Wang
Chao-Liang Weng
Ying Shan
VGen
DiffM
115
269
0
17 Jan 2024
MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation
Weimin Wang
Jiawei Liu
Zhijie Lin
Jiangqiao Yan
Shuo Chen
...
Jie Wu
Jun Hao Liew
Hanshu Yan
Daquan Zhou
Jiashi Feng
VGen
DiffM
68
17
0
09 Jan 2024
Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets
A. Blattmann
Tim Dockhorn
Sumith Kulal
Daniel Mendelevitch
Maciej Kilian
...
Zion English
Vikram S. Voleti
Adam Letts
Varun Jampani
Robin Rombach
VGen
150
985
0
25 Nov 2023
Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model
Chunming He
Chengyu Fang
Yulun Zhang
Chenyu You
Kai Li
Longxiang Tang
Fengyang Xiao
Xiu Li
Z. Guo
21
24
0
20 Nov 2023
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Bin Lin
Yang Ye
Bin Zhu
Jiaxi Cui
Munan Ning
Peng Jin
Li Yuan
VLM
MLLM
182
576
0
16 Nov 2023
Training-Free Layout Control with Cross-Attention Guidance
Minghao Chen
Iro Laina
Andrea Vedaldi
DiffM
121
217
0
06 Apr 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
Axel Sauer
Katja Schwarz
Andreas Geiger
174
354
0
01 Feb 2022
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
3,790
0
24 Feb 2021