Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

17 November 2023
Rohit Girdhar
Mannat Singh
Andrew Brown
Quentin Duval
S. Azadi
Sai Saketh Rambhatla
Akbar Shah
Xi Yin
Devi Parikh
Ishan Misra
    DiffM
    VGen

Papers citing "Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning"

50 / 159 papers shown
Progressive Autoregressive Video Diffusion Models
Desai Xie
Zhan Xu
Yicong Hong
Hao Tan
Difan Liu
Feng Liu
Arie E. Kaufman
Yang Zhou
VGen
DiffM
56
10
0
10 Oct 2024
HARIVO: Harnessing Text-to-Image Models for Video Generation
Mingi Kwon
Seoung Wug Oh
Yang Zhou
Difan Liu
Joon-Young Lee
Haoran Cai
Baqiao Liu
Feng Liu
Youngjung Uh
VGen
38
1
0
10 Oct 2024
Loong: Generating Minute-level Long Videos with Autoregressive Language Models
Yuqing Wang
Tianwei Xiong
Daquan Zhou
Zhijie Lin
Yang Zhao
Bingyi Kang
Jiashi Feng
Xihui Liu
VGen
46
23
0
03 Oct 2024
Flex3D: Feed-Forward 3D Generation With Flexible Reconstruction Model And Input View Curation
Junlin Han
Jianyuan Wang
Andrea Vedaldi
Philip Torr
Filippos Kokkinos
26
4
0
01 Oct 2024
Characterizing and Efficiently Accelerating Multimodal Generation Model Inference
Yejin Lee
Anna Y. Sun
Basil Hosmer
Bilge Acun
Can Balioglu
...
Ram Pasunuru
Scott Yih
Sravya Popuri
Xing Liu
Carole-Jean Wu
50
2
0
30 Sep 2024
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation
Shaowei Liu
Zhongzheng Ren
Saurabh Gupta
Shenlong Wang
VGen
DiffM
PINN
39
33
0
27 Sep 2024
Pixel-Space Post-Training of Latent Diffusion Models
Christina Zhang
Simran Motwani
Matthew Yu
Ji Hou
Felix Juefei-Xu
Sam S. Tsai
Peter Vajda
Zijian He
Jialiang Wang
18
2
0
26 Sep 2024
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
Homanga Bharadhwaj
Debidatta Dwibedi
Abhinav Gupta
Shubham Tulsiani
Carl Doersch
Ted Xiao
Dhruv Shah
Fei Xia
Dorsa Sadigh
Sean Kirmani
VGen
LM&Ro
35
27
0
24 Sep 2024
Think Twice Before You Act: Improving Inverse Problem Solving With MCMC
Y. Zhu
Zehao Dou
Haoxin Zheng
Yasi Zhang
Ying Nian Wu
Ruiqi Gao
DiffM
17
4
0
13 Sep 2024
AMG: Avatar Motion Guided Video Generation
Zhangsihao Yang
Mengyi Shan
Mohammad Farazi
Wenhui Zhu
Yanxi Chen
Xuanzhao Dong
Yalin Wang
VGen
DiffM
64
0
0
02 Sep 2024
Compositional 3D-aware Video Generation with LLM Director
Hanxin Zhu
Tianyu He
Anni Tang
Junliang Guo
Zhibo Chen
Jiang Bian
DiffM
VGen
31
7
0
31 Aug 2024
Factorized-Dreamer: Training A High-Quality Video Generator with Limited and Low-Quality Data
Tao Yang
Yangming Shi
Yunwen Huang
Feng Chen
Yin Zheng
Lei Zhang
DiffM
VGen
59
0
0
19 Aug 2024
FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance
Jiasong Feng
Ao Ma
Jing Wang
Bo Cheng
Xiaodan Liang
Dawei Leng
Yuhui Yin
DiffM
VGen
37
6
0
15 Aug 2024
Scene123: One Prompt to 3D Scene Generation via Video-Assisted and Consistency-Enhanced MAE
Yiying Yang
Fukun Yin
Jiayuan Fan
Xin Chen
Wanzhang Li
Gang Yu
VGen
44
0
0
10 Aug 2024
Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics
Ruining Li
Chuanxia Zheng
Christian Rupprecht
Andrea Vedaldi
DiffM
VGen
36
9
0
08 Aug 2024
SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement
Mark Boss
Zixuan Huang
Aaryaman Vasishta
Varun Jampani
3DGS
82
31
0
01 Aug 2024
Anchored Diffusion for Video Face Reenactment
I. Kligvasser
Regev Cohen
G. Leifman
Ehud Rivlin
Michael Elad
DiffM
VGen
34
1
0
21 Jul 2024
Still-Moving: Customized Video Generation without Customized Video Data
Hila Chefer
Shiran Zada
Roni Paiss
Ariel Ephrat
Omer Tov
Michael Rubinstein
Lior Wolf
Tali Dekel
T. Michaeli
Inbar Mosseri
DiffM
VGen
26
20
0
11 Jul 2024
MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions
Xuan Ju
Yiming Gao
Zhaoyang Zhang
Ziyang Yuan
Xintao Wang
Ailing Zeng
Yu Xiong
Qiang Xu
Ying Shan
VGen
61
36
0
08 Jul 2024
Meta 3D TextureGen: Fast and Consistent Texture Generation for 3D Objects
Raphael Bensadoun
Yanir Kleiman
Idan Azuri
Omri Harosh
Andrea Vedaldi
Natalia Neverova
Oran Gafni
45
27
0
02 Jul 2024
Text-Aware Diffusion for Policy Learning
Calvin Luo
Mandy He
Zilai Zeng
Chen Sun
23
4
0
02 Jul 2024
Evaluation of Text-to-Video Generation Models: A Dynamics Perspective
Mingxiang Liao
Hannan Lu
Xinyu Zhang
Fang Wan
Tianyu Wang
Yuzhong Zhao
W. Zuo
Qixiang Ye
Jingdong Wang
VGen
EGVM
58
17
0
01 Jul 2024
OccFusion: Rendering Occluded Humans with Generative Diffusion Priors
Adam Sun
Tiange Xiang
Scott Delp
Li Fei-Fei
Ehsan Adeli
29
2
0
29 Jun 2024
Diffusion Model-Based Video Editing: A Survey
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Dacheng Tao
VGen
55
20
0
26 Jun 2024
Identifying and Solving Conditional Image Leakage in Image-to-Video Diffusion Model
Min Zhao
Hongzhou Zhu
Chendong Xiang
Kaiwen Zheng
Chongxuan Li
Jun Zhu
61
8
0
22 Jun 2024
Image Conductor: Precision Control for Interactive Video Synthesis
Yaowei Li
Xintao Wang
Zhaoyang Zhang
Zhouxia Wang
Ziyang Yuan
Liangbin Xie
Yuexian Zou
Ying Shan
VGen
42
23
0
21 Jun 2024
VideoScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation
Xuan He
Dongfu Jiang
Ge Zhang
Max W.F. Ku
Achint Soni
...
Yaswanth Narsupalli
Rongqi Fan
Zhiheng Lyu
Yuchen Lin
Wenhu Chen
EGVM
VGen
ALM
43
41
0
21 Jun 2024
ViD-GPT: Introducing GPT-style Autoregressive Generation in Video Diffusion Models
Kaifeng Gao
Jiaxin Shi
Hanwang Zhang
Chunping Wang
Jun Xiao
DiffM
VGen
62
11
0
16 Jun 2024
L4GM: Large 4D Gaussian Reconstruction Model
Jiawei Ren
Kevin Xie
Ashkan Mirzaei
Hanxue Liang
Xiaohui Zeng
...
Ziwei Liu
Antonio Torralba
Sanja Fidler
Seung Wook Kim
Huan Ling
3DGS
22
37
0
14 Jun 2024
LRM-Zero: Training Large Reconstruction Models with Synthesized Data
Desai Xie
Sai Bi
Zhixin Shu
Kai Zhang
Zexiang Xu
Yi Zhou
Soren Pirk
Arie E. Kaufman
Xin Sun
Hao Tan
SyDa
43
14
0
13 Jun 2024
Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality
Tianle Zhang
Langtian Ma
Yuchen Yan
Yuchen Zhang
Kai Wang
...
Wenqi Shao
Yang You
Yu Qiao
Ping Luo
Kaipeng Zhang
VGen
58
2
0
13 Jun 2024
HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness
Zihui Xue
Mi Luo
Changan Chen
Kristen Grauman
DiffM
22
6
0
11 Jun 2024
AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction
Zhen Xing
Qi Dai
Zejia Weng
Zuxuan Wu
Yu-Gang Jiang
VGen
39
14
0
10 Jun 2024
CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion
Xingrui Wang
Xin Li
Zhibo Chen
DiffM
42
1
0
07 Jun 2024
Flash3D: Feed-Forward Generalisable 3D Scene Reconstruction from a Single Image
Stanislaw Szymanowicz
Eldar Insafutdinov
Chuanxia Zheng
Dylan Campbell
João F. Henriques
Christian Rupprecht
Andrea Vedaldi
3DGS
24
49
0
06 Jun 2024
Coherent Zero-Shot Visual Instruction Generation
Quynh Phung
Songwei Ge
Jia-Bin Huang
47
2
0
06 Jun 2024
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
Yang Sui
Yanyu Li
Anil Kag
Yerlan Idelbayev
Junli Cao
Ju Hu
Dhritiman Sagar
Bo Yuan
Sergey Tulyakov
Jian Ren
MQ
36
17
0
06 Jun 2024
SF-V: Single Forward Video Generation Model
Zhixing Zhang
Yanyu Li
Yushu Wu
Yanwu Xu
Anil Kag
...
Aliaksandr Siarohin
Junli Cao
Dimitris N. Metaxas
Sergey Tulyakov
Jian Ren
DiffM
VGen
31
9
0
06 Jun 2024
PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting
Qiaowei Miao
Yawei Luo
Yi Yang
3DGS
DiffM
36
7
0
30 May 2024
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark
Haoxing Chen
Yan Hong
Zizheng Huang
Zhuoer Xu
Zhangxuan Gu
...
Jun Lan
Huijia Zhu
Jianfu Zhang
Weiqiang Wang
Huaxiong Li
Mamba
80
13
0
30 May 2024
Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao
Jiazhi Yang
Li Chen
Kashyap Chitta
Yihang Qiu
Andreas Geiger
Jun Zhang
Hongyang Li
60
75
0
27 May 2024
PoseCrafter: One-Shot Personalized Video Synthesis Following Flexible Pose Control
Yong Zhong
Min Zhao
Zebin You
Xiaofeng Yu
Changwang Zhang
Chongxuan Li
DiffM
29
6
0
23 May 2024
Images that Sound: Composing Images and Sounds on a Single Canvas
Ziyang Chen
Daniel Geng
Andrew Owens
DiffM
48
9
0
20 May 2024
FIFO-Diffusion: Generating Infinite Videos from Text without Training
Jihwan Kim
Junoh Kang
Jinyoung Choi
Bohyung Han
DiffM
VGen
58
23
0
19 May 2024
CAT3D: Create Anything in 3D with Multi-View Diffusion Models
Ruiqi Gao
Aleksander Holynski
Philipp Henzler
Arthur Brussee
Ricardo Martín Brualla
Pratul P. Srinivasan
Jonathan T. Barron
Ben Poole
29
149
0
16 May 2024
TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation
Hritik Bansal
Yonatan Bitton
Michal Yarom
Idan Szpektor
Aditya Grover
Kai-Wei Chang
DiffM
42
11
0
07 May 2024
Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models
Fan Bao
Chendong Xiang
Gang Yue
Guande He
Hongzhou Zhu
Kaiwen Zheng
Min Zhao
Shilong Liu
Yaole Wang
Jun Zhu
VGen
110
50
0
07 May 2024
Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond
Zheng Zhu
Xiaofeng Wang
Wangbo Zhao
Chen Min
Nianchen Deng
...
Dawei Zhao
Liang Xiao
Jian-jun Zhao
Jiwen Lu
Guan Huang
VGen
LM&Ro
79
35
0
06 May 2024
Video Diffusion Models: A Survey
Andrew Melnik
Michal Ljubljanac
Cong Lu
Qi Yan
Weiming Ren
Helge J. Ritter
VGen
63
12
0
06 May 2024
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation
Tianyuan Zhang
Hong-Xing Yu
Rundi Wu
Brandon Yushan Feng
Changxi Zheng
Noah Snavely
Jiajun Wu
William T. Freeman
AI4CE
VGen
77
61
0
19 Apr 2024