ResearchTrend.AI
Structure and Content-Guided Video Synthesis with Diffusion Models

6 February 2023
Patrick Esser
Johnathan Chiu
Parmida Atighehchian
Jonathan Granskog
Anastasis Germanidis
    DiffM
    VGen

Papers citing "Structure and Content-Guided Video Synthesis with Diffusion Models"

50 / 422 papers shown
C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation
Yuhao Li
Mirana Claire Angel
Salman Khan
Yu Zhu
Jinqiu Sun
Yanning Zhang
F. Khan
VGen
46
0
0
27 Feb 2025
ASurvey: Spatiotemporal Consistency in Video Generation
Zhiyu Yin
Kehai Chen
Xuefeng Bai
Ruili Jiang
J. Li
Hongdong Li
Jin Liu
Yang Xiang
Jun Yu
Min Zhang
EGVM
VGen
AI4TS
54
0
0
25 Feb 2025
Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
Sicheng Xie
Haidong Cao
Zejia Weng
Zhen Xing
Shiwei Shen
Jiaqi Leng
Xipeng Qiu
Yanwei Fu
Zuxuan Wu
Yu Jiang
47
0
0
23 Feb 2025
Hardware-Friendly Static Quantization Method for Video Diffusion Transformers
Sanghyun Yi
Qingfeng Liu
Mostafa El-Khamy
MQ
VGen
35
0
0
20 Feb 2025
FreeBlend: Advancing Concept Blending with Staged Feedback-Driven Interpolation Diffusion
Yufan Zhou
Haoyu Shen
Huan Wang
DiffM
97
0
0
17 Feb 2025
When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding
Pingping Zhang
Jinlong Li
Kecheng Chen
Meng Wang
Long Xu
Haoliang Li
N. Sebe
Sam Kwong
Shiqi Wang
VGen
115
3
0
17 Feb 2025
SayAnything: Audio-Driven Lip Synchronization with Conditional Video Diffusion
Junxian Ma
Shiwen Wang
Jian Yang
Junyi Hu
Jian Liang
Guosheng Lin
Jingbo Chen
Kai Li
Yu Meng
DiffM
VGen
61
3
0
17 Feb 2025
Efficient-vDiT: Efficient Video Diffusion Transformers With Attention Tile
Hangliang Ding
Dacheng Li
Runlong Su
Peiyuan Zhang
Zhijie Deng
Ion Stoica
Hao Zhang
VGen
65
4
0
10 Feb 2025
Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion
Marco Mistretta
Alberto Baldrati
Lorenzo Agnolucci
Marco Bertini
Andrew D. Bagdanov
CLIP
VLM
99
2
0
06 Feb 2025
IPO: Iterative Preference Optimization for Text-to-Video Generation
Xiaomeng Yang
Zhiyu Tan
Xuecheng Nie
VGen
101
1
0
04 Feb 2025
CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation
Zheng Chong
Wenqing Zhang
Shiyue Zhang
Jun Zheng
Xiao Dong
Haoxiang Li
Yiling Wu
D. Jiang
Xiaodan Liang
DiffM
26
1
0
20 Jan 2025
Qffusion: Controllable Portrait Video Editing via Quadrant-Grid Attention Learning
Maomao Li
Lijian Lin
Yunfei Liu
Ye Zhu
Yu Li
DiffM
VGen
39
0
0
11 Jan 2025
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning
Yuzhou Huang
Ziyang Yuan
Quande Liu
Qiulin Wang
Xintao Wang
Ruimao Zhang
Pengfei Wan
Di Zhang
Kun Gai
VGen
DiffM
35
10
0
08 Jan 2025
GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking
Weikang Bian
Zhaoyang Huang
Xiaoyu Shi
Yijin Li
Fu-Yun Wang
Hongsheng Li
3DGS
VGen
DiffM
34
3
0
05 Jan 2025
Edicho: Consistent Image Editing in the Wild
Qingyan Bai
Hao Ouyang
Yinghao Xu
Qiuyu Wang
Ceyuan Yang
Ka Leong Cheng
Yujun Shen
Qifeng Chen
DiffM
65
1
0
30 Dec 2024
SurgSora: Decoupled RGBD-Flow Diffusion Model for Controllable Surgical Video Generation
Tong Chen
Shuya Yang
Junyi Wang
Long Bai
Hongliang Ren
Luping Zhou
VGen
MedIm
75
2
0
18 Dec 2024
RapidNet: Multi-Level Dilated Convolution Based Mobile Backbone
Mustafa Munir
Md Mostafijur Rahman
R. Marculescu
MedIm
ViT
62
0
0
14 Dec 2024
UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer
Delong Liu
Zhaohui Hou
Mingjie Zhan
Shihao Han
Zhicheng Zhao
Fei Su
VGen
91
0
0
12 Dec 2024
DIVE: Taming DINO for Subject-Driven Video Editing
Yi Huang
Wei Xiong
He Zhang
Chaoqi Chen
Jianzhuang Liu
Mingfu Yan
Shifeng Chen
VGen
DiffM
73
0
0
04 Dec 2024
InfinityDrive: Breaking Time Limits in Driving World Models
Xi Guo
C. Ding
Haoxuan Dou
Xin Zhang
Weixuan Tang
Wei Yu Wu
VGen
81
5
0
02 Dec 2024
MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models
Xiaomin Li
Xu Jia
Qinghe Wang
Haiwen Diao
Mengmeng Ge
Pengxiang Li
You He
Huchuan Lu
VGen
DiffM
60
3
0
02 Dec 2024
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses
Yatian Pang
Bin Zhu
Bin Lin
Mingzhe Zheng
Francis E. H. Tay
Ser-Nam Lim
Harry Yang
Li Yuan
VGen
3DH
79
2
0
30 Nov 2024
SPAgent: Adaptive Task Decomposition and Model Selection for General Video Generation and Editing
Rong-Cheng Tu
Wenhao Sun
Zhao Jin
Jingyi Liao
Jiaxing Huang
Dacheng Tao
VGen
DiffM
92
3
0
28 Nov 2024
Sonic: Shifting Focus to Global Audio Perception in Portrait Animation
Xiaozhong Ji
Xiaobin Hu
Zhihong Xu
Junwei Zhu
Chuming Lin
...
Donghao Luo
Yi Chen
Qin Lin
Qinglin Lu
Chengjie Wang
VGen
65
3
0
25 Nov 2024
UVCG: Leveraging Temporal Consistency for Universal Video Protection
KaiZhou Li
Jindong Gu
Xinchun Yu
Junjie Cao
Yansong Tang
Xiao-Ping Zhang
AAML
74
0
0
25 Nov 2024
Human-Activity AGV Quality Assessment: A Benchmark Dataset and an Objective Evaluation Metric
Zhichao Zhang
Wei Sun
Xinyue Li
Yunhao Li
Qihang Ge
...
Zhongpeng Ji
Fengyu Sun
Shangling Jui
Xiongkuo Min
Guangtao Zhai
EGVM
117
1
0
25 Nov 2024
Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification
Sundar Sripada V. S.
Minkyu Choi
Sahil Shah
Harsh Goel
Mohammad Omama
Sandeep P. Chinchali
EGVM
108
2
0
22 Nov 2024
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation
Weijia Wu
Mingyu Liu
Zeyu Zhu
Xi Xia
Haoen Feng
Wen Wang
Kevin Qinghong Lin
Chunhua Shen
Mike Zheng Shou
DiffM
VGen
114
1
0
22 Nov 2024
SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input
Zhen Lv
Yangqi Long
Congzhentao Huang
Cao Li
Chengfei Lv
Hao Ren
Dian Zheng
DiffM
VGen
MDE
112
5
0
18 Nov 2024
OnlyFlow: Optical Flow based Motion Conditioning for Video Diffusion Models
Mathis Koroglu
Hugo Caselles-Dupré
Guillaume Jeanneret Sanmiguel
Matthieu Cord
VGen
DiffM
20
1
0
15 Nov 2024
EchoMimicV2: Towards Striking, Simplified, and Semi-Body Human Animation
Rang Meng
Xingyu Zhang
Yuming Li
Chenguang Ma
26
5
0
15 Nov 2024
World Models: The Safety Perspective
Zifan Zeng
Chongzhe Zhang
Feng Liu
Joseph Sifakis
Qunli Zhang
Shiming Liu
Peng Wang
KELM
LLMAG
40
1
0
12 Nov 2024
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration
Panwen Hu
Jin Jiang
Jianqi Chen
Mingfei Han
Shengcai Liao
Xiaojun Chang
Xiaodan Liang
VGen
DiffM
33
5
0
07 Nov 2024
Optical Flow Representation Alignment Mamba Diffusion Model for Medical Video Generation
Zhenbin Wang
Lei Zhang
Lituan Wang
Minjuan Zhu
Zhenwei Zhang
VGen
MedIm
54
1
0
03 Nov 2024
SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation
Yining Hong
Beide Liu
Maxine Wu
Yuanhao Zhai
Kai-Wei Chang
...
Chung-Ching Lin
Jianfeng Wang
Z. Yang
Yingnian Wu
Lijuan Wang
VGen
35
6
0
30 Oct 2024
Investigating Memorization in Video Diffusion Models
C. L. P. Chen
Enhuai Liu
Daochang Liu
M. Shah
Chang Xu
VGen
DiffM
76
1
0
29 Oct 2024
Extrapolating Prospective Glaucoma Fundus Images through Diffusion Model in Irregular Longitudinal Sequences
Zhihao Zhao
Junjie Yang
Shahrooz Faghihroohi
Yinzheng Zhao
Daniel Zapp
Kai-Qi Huang
Nassir Navab
M. A. Nasseri
DiffM
MedIm
54
0
0
28 Oct 2024
Video to Video Generative Adversarial Network for Few-shot Learning Based on Policy Gradient
Yintai Ma
Diego Klabjan
J. Utke
VGen
GAN
36
0
0
28 Oct 2024
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control
Yujie Wei
Shiwei Zhang
Hangjie Yuan
Xiang Wang
Haonan Qiu
...
F. Liu
Zhizhong Huang
Jiaxin Ye
Yingya Zhang
Hongming Shan
DiffM
VGen
69
14
0
17 Oct 2024
Shaping a Stabilized Video by Mitigating Unintended Changes for Concept-Augmented Video Editing
Mingce Guo
Jingxuan He
Shengeng Tang
Zhangye Wang
Lechao Cheng
VGen
DiffM
18
0
0
16 Oct 2024
Hessian-Informed Flow Matching
Christopher Iliffe Sprague
Arne Elofsson
Hossein Azizpour
13
0
0
15 Oct 2024
Exploring Behavior-Relevant and Disentangled Neural Dynamics with Generative Diffusion Models
Yule Wang
Chengrui Li
Weihan Li
Anqi Wu
DiffM
26
4
0
12 Oct 2024
E-Motion: Future Motion Simulation via Event Sequence Diffusion
Song Wu
Zhiyu Zhu
Junhui Hou
Guangming Shi
Jinjian Wu
DiffM
VGen
35
0
0
11 Oct 2024
Story-Adapter: A Training-free Iterative Framework for Long Story Visualization
Jiawei Mao
Xiaoke Huang
Yunfei Xie
Yuanqi Chang
Mude Hui
Bingjie Xu
Yuyin Zhou
VGen
DiffM
41
0
0
08 Oct 2024
T2V-Turbo-v2: Enhancing Video Generation Model Post-Training through Data, Reward, and Conditional Guidance Design
Jiachen Li
Qian Long
Jian Zheng
Xiaofeng Gao
Robinson Piramuthu
Wenhu Chen
William Yang Wang
VGen
25
22
0
08 Oct 2024
ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way
Jiazi Bu
Pengyang Ling
Pan Zhang
Tong Wu
Xiaoyi Dong
Yuhang Zang
Yuhang Cao
Dahua Lin
Jiaqi Wang
DiffM
VGen
28
0
0
08 Oct 2024
GS-VTON: Controllable 3D Virtual Try-on with Gaussian Splatting
Yukang Cao
Masoud Hadi
Liang Pan
Ziwei Liu
3DGS
DiffM
50
4
0
07 Oct 2024
L-C4: Language-Based Video Colorization for Creative and Consistent Color
Zheng Chang
Shuchen Weng
Huan Ouyang
Yu Li
Si Li
Boxin Shi
DiffM
VGen
VLM
20
0
0
07 Oct 2024
ACDC: Autoregressive Coherent Multimodal Generation using Diffusion Correction
Hyungjin Chung
Dohun Lee
Jong Chul Ye
VGen
DiffM
21
2
0
07 Oct 2024
CAR: Controllable Autoregressive Modeling for Visual Generation
Ziyu Yao
Jialin Li
Yifeng Zhou
Yong Liu
Xi Jiang
Chengjie Wang
Feng Zheng
Yuexian Zou
Lei Li
DiffM
35
13
0
07 Oct 2024