ResearchTrend.AI
Photorealistic Video Generation with Diffusion Models

11 December 2023
Agrim Gupta
Lijun Yu
Kihyuk Sohn
Xiuye Gu
Meera Hahn
Fei-Fei Li
Irfan Essa
Lu Jiang
José Lezama
    VGen

Papers citing "Photorealistic Video Generation with Diffusion Models"

42 / 142 papers shown
Track2Act: Predicting Point Tracks from Internet Videos enables Diverse Zero-shot Robot Manipulation
Homanga Bharadhwaj
Roozbeh Mottaghi
Abhinav Gupta
Shubham Tulsiani
3DPC
41
15
0
02 May 2024
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings
Olivia Wiles
Chuhan Zhang
Isabela Albuquerque
Ivana Kajić
Su Wang
...
Anant Nawalgaria
Jordi Pont-Tuset
Aida Nematzadeh
EGVM
120
13
0
25 Apr 2024
PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation
Tianyuan Zhang
Hong-Xing Yu
Rundi Wu
Brandon Yushan Feng
Changxi Zheng
Noah Snavely
Jiajun Wu
William T. Freeman
AI4CE
VGen
77
61
0
19 Apr 2024
On the Content Bias in Fréchet Video Distance
Jason S. Hoffman
Aniruddha Mahapatra
Gaurav Parmar
Jun-Yan Zhu
Jia-Bin Huang
EGVM
50
15
0
18 Apr 2024
AniClipart: Clipart Animation with Text-to-Video Priors
Rong Wu
Wanchao Su
Kede Ma
Jing Liao
24
3
0
18 Apr 2024
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
Han Lin
Jaemin Cho
Abhaysinh Zala
Mohit Bansal
DiffM
VGen
58
20
0
15 Apr 2024
An Overview of Diffusion Models: Applications, Guided Generation, Statistical Rates and Optimization
Minshuo Chen
Song Mei
Jianqing Fan
Mengdi Wang
VLM
MedIm
DiffM
32
48
0
11 Apr 2024
GoodDrag: Towards Good Practices for Drag Editing with Diffusion Models
Zewei Zhang
Huan Liu
Jun Chen
Xiangyu Xu
DiffM
19
8
0
10 Apr 2024
GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis
S. Sastry
Subash Khanal
A. Dhakal
Nathan Jacobs
47
6
0
09 Apr 2024
Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction
Keyu Tian
Yi-Xin Jiang
Zehuan Yuan
Bingyue Peng
Liwei Wang
VGen
25
248
0
03 Apr 2024
Deepfake Generation and Detection: A Benchmark and Survey
Gan Pei
Jiangning Zhang
Menghan Hu
Zhenyu Zhang
Chengjie Wang
Yunsheng Wu
Guangtao Zhai
Jian Yang
Chunhua Shen
Dacheng Tao
38
24
0
26 Mar 2024
SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion Transformer
Rui Zhu
Yingwei Pan
Yehao Li
Ting Yao
Zhenglong Sun
Tao Mei
C. Chen
48
23
0
25 Mar 2024
Mora: Enabling Generalist Video Generation via A Multi-Agent Framework
Zhengqing Yuan
Ruoxi Chen
Zhaoxu Li
Haolong Jia
Lifang He
Chi Wang
Lichao Sun
VGen
50
27
0
20 Mar 2024
FastVideoEdit: Leveraging Consistency Models for Efficient Text-to-Video Editing
Youyuan Zhang
Xuan Ju
James J. Clark
VGen
DiffM
32
6
0
10 Mar 2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu
Kai Zhang
Yuan Li
Zhiling Yan
Chujie Gao
...
Yue Huang
Hanchi Sun
Jianfeng Gao
Lifang He
Lichao Sun
VLM
VGen
EGVM
65
241
0
27 Feb 2024
Accelerating Parallel Sampling of Diffusion Models
Zhiwei Tang
Jiasheng Tang
Hao Luo
Fan Wang
Tsung-Hui Chang
30
11
0
15 Feb 2024
Rolling Diffusion Models
David Ruhe
Jonathan Heek
Tim Salimans
Emiel Hoogeboom
DiffM
23
32
0
12 Feb 2024
DiscDiff: Latent Diffusion Model for DNA Sequence Generation
Zehui Li
Yuhao Ni
W. Beardall
Guoxuan Xia
Akashaditya Das
Guy-Bart Stan
Yiren Zhao
11
6
0
08 Feb 2024
CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling
Junchao Gong
Lei Bai
Peng Ye
Wanghan Xu
Na Liu
Jianhua Dai
Xiaokang Yang
Wanli Ouyang
AI4Cl
40
10
0
06 Feb 2024
Lumiere: A Space-Time Diffusion Model for Video Generation
Omer Bar-Tal
Hila Chefer
Omer Tov
Charles Herrmann
Roni Paiss
...
T. Michaeli
Oliver Wang
Deqing Sun
Tali Dekel
Inbar Mosseri
VGen
101
214
0
23 Jan 2024
DITTO: Diffusion Inference-Time T-Optimization for Music Generation
Zachary Novack
Julian McAuley
Taylor Berg-Kirkpatrick
Nicholas J. Bryan
DiffM
16
32
0
22 Jan 2024
WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
Xiaofeng Wang
Zheng Zhu
Guan Huang
Boyuan Wang
Xinze Chen
Jiwen Lu
VGen
27
32
0
18 Jan 2024
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers
Nanye Ma
Mark Goldstein
M. S. Albergo
Nicholas M. Boffi
Eric Vanden-Eijnden
Saining Xie
DiffM
27
163
0
16 Jan 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
Jie M. Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
56
0
0
15 Jan 2024
Latte: Latent Diffusion Transformer for Video Generation
Xin Ma
Yaohui Wang
Gengyun Jia
Xinyuan Chen
Z. Liu
Yuan-Fang Li
Cunjian Chen
Yu Qiao
DiffM
VGen
123
233
0
05 Jan 2024
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk
Lijun Yu
Xiuye Gu
José Lezama
Jonathan Huang
...
Irfan Essa
Huisheng Wang
David A. Ross
Bryan Seybold
Lu Jiang
VGen
15
235
0
21 Dec 2023
Fine-grained Controllable Video Generation via Object Appearance and Context
Hsin-Ping Huang
Yu-Chuan Su
Deqing Sun
Lu Jiang
Xuhui Jia
Yukun Zhu
Ming-Hsuan Yang
DiffM
VGen
13
13
0
05 Dec 2023
VBench: Comprehensive Benchmark Suite for Video Generative Models
Ziqi Huang
Yinan He
Jiashuo Yu
Fan Zhang
Chenyang Si
...
Xinyuan Chen
Limin Wang
Dahua Lin
Yu Qiao
Ziwei Liu
VGen
59
341
0
29 Nov 2023
A Survey on Video Diffusion Models
Zhen Xing
Qijun Feng
Haoran Chen
Qi Dai
Hang-Rui Hu
Hang Xu
Zuxuan Wu
Yu-Gang Jiang
EGVM
VGen
52
112
0
16 Oct 2023
MDTv2: Masked Diffusion Transformer is a Strong Image Synthesizer
Shanghua Gao
Pan Zhou
Ming-Ming Cheng
Shuicheng Yan
DiffM
137
155
0
25 Mar 2023
Muse: Text-To-Image Generation via Masked Generative Transformers
Huiwen Chang
Han Zhang
Jarred Barber
AJ Maschinot
José Lezama
...
Kevin Patrick Murphy
William T. Freeman
Michael Rubinstein
Yuanzhen Li
Dilip Krishnan
DiffM
197
517
0
02 Jan 2023
MaskViT: Masked Visual Pre-Training for Video Prediction
Agrim Gupta
Stephen Tian
Yunzhi Zhang
Jiajun Wu
Roberto Martín-Martín
Li Fei-Fei
100
110
0
23 Jun 2022
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Wenyi Hong
Ming Ding
Wendi Zheng
Xinghan Liu
Jie Tang
DiffM
243
556
0
29 May 2022
Flexible Diffusion Modeling of Long Videos
William Harvey
Saeid Naderiparizi
Vaden Masrani
Christian Weilbach
Frank D. Wood
DiffM
BDL
VGen
174
284
0
23 May 2022
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
VideoGPT: Video Generation using VQ-VAE and Transformers
Wilson Yan
Yunzhi Zhang
Pieter Abbeel
A. Srinivas
ViT
VGen
242
482
0
20 Apr 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Transformation-based Adversarial Video Prediction on Large-Scale Data
Pauline Luc
Aidan Clark
Sander Dieleman
Diego de Las Casas
Yotam Doron
Albin Cassirer
Karen Simonyan
VGen
214
86
0
09 Mar 2020
A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras
S. Laine
Timo Aila
262
10,183
0
12 Dec 2018
A Learned Representation For Artistic Style
Vincent Dumoulin
Jonathon Shlens
M. Kudlur
GAN
210
1,153
0
24 Oct 2016
U-Net: Convolutional Networks for Biomedical Image Segmentation
Olaf Ronneberger
Philipp Fischer
Thomas Brox
SSeg
3DV
229
74,467
0
18 May 2015