Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2309.04509
Cited By
The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion
8 September 2023
Yujin Jeong
Won-Wha Ryoo
Seunghyun Lee
Dabin Seo
Wonmin Byeon
Sangpil Kim
Jinkyu Kim
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion"
22 / 22 papers shown
Title
Replay-Based Continual Learning with Dual-Layered Distillation and a Streamlined U-Net for Efficient Text-to-Image Generation
Md. Naimur Asif Borno
Md Sakib Hossain Shovon
Asmaa Soliman Al-Moisheer
Mohammad Ali Moni
24
0
0
11 May 2025
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
Xingrui Wang
Jiang-Long Liu
Z. Wang
Xiaodong Yu
Jialian Wu
X. Sun
Yusheng Su
Alan L. Yuille
Zicheng Liu
Emad Barsoum
DiffM
VGen
46
0
0
13 Apr 2025
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
58
1
0
12 Nov 2024
Language-Guided Joint Audio-Visual Editing via One-Shot Adaptation
Susan Liang
Chao Huang
Yapeng Tian
Anurag Kumar
Chenliang Xu
DiffM
34
6
0
09 Oct 2024
D
4
^4
4
M: Dataset Distillation via Disentangled Diffusion Model
Duo Su
Junjie Hou
Weizhi Gao
Yingjie Tian
Bowen Tang
DD
35
18
0
21 Jul 2024
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual
Chunghsin Yeh
Ioannis Tsiamas
Joan Serra
DiffM
VGen
33
13
0
15 Jul 2024
Read, Watch and Scream! Sound Generation from Text and Video
Yujin Jeong
Yunji Kim
Sanghyuk Chun
Jiyoung Lee
VGen
DiffM
25
11
0
08 Jul 2024
Sequential Contrastive Audio-Visual Learning
Ioannis Tsiamas
Santiago Pascual
Chunghsin Yeh
Joan Serra
33
2
0
08 Jul 2024
VCoME: Verbal Video Composition with Multimodal Editing Effects
Weibo Gong
Xiaojie Jin
Xin Li
Dongliang He
Xinglong Wu
35
0
0
05 Jul 2024
Diffusion Model-Based Video Editing: A Survey
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Dacheng Tao
VGen
55
22
0
26 Jun 2024
AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection
Trevine Oorloff
Surya Koppisetti
Nicolò Bonettini
Divyaraj Solanki
Ben Colman
Yaser Yacoob
Ali Shahriyari
Gaurav Bharaj
30
20
0
05 Jun 2024
SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models
Burak Can Biner
Farrin Marouf Sofian
Umur Berkay Karakacs
Duygu Ceylan
Erkut Erdem
Aykut Erdem
21
7
0
01 May 2024
Audio-Synchronized Visual Animation
Lin Zhang
Shentong Mo
Yijing Zhang
Pedro Morgado
DiffM
40
18
0
08 Mar 2024
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
Seung Hyun Lee
Yinxiao Li
Junjie Ke
Innfarn Yoo
Han Zhang
...
Junfeng He
Gang Li
Sangpil Kim
Irfan Essa
Feng Yang
EGVM
27
18
0
11 Jan 2024
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy
Weijian Mai
Jian Zhang
Pengfei Fang
Zhijun Zhang
37
9
0
31 Dec 2023
Diffusion for Natural Image Matting
Yihan Hu
Yiheng Lin
Wei Wang
Yao-Min Zhao
Yunchao Wei
Humphrey Shi
18
7
0
10 Dec 2023
CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling
Ruihan Yang
H. Gamper
Sebastian Braun
DiffM
22
5
0
08 Dec 2023
ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation
Moayed Haji-Ali
Guha Balakrishnan
Vicente Ordonez
35
23
0
30 Nov 2023
Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of Latent-Based Diffusion Models
Poyuan Mao
Shashank Kotyan
Tham Yik Foong
Danilo Vasconcellos Vargas
22
5
0
24 Nov 2023
A Survey on Video Diffusion Models
Zhen Xing
Qijun Feng
Haoran Chen
Qi Dai
Hang-Rui Hu
Hang Xu
Zuxuan Wu
Yu-Gang Jiang
EGVM
VGen
55
115
0
16 Oct 2023
EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model
Xinya Ji
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Wayne Wu
Feng Xu
Xun Cao
CVBM
50
157
0
30 May 2022
Sound2Sight: Generating Visual Dynamics from Sound and Context
A. Cherian
Moitreya Chatterjee
N. Ahuja
VGen
64
35
0
23 Jul 2020
1