ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2309.04509
  4. Cited By
The Power of Sound (TPoS): Audio Reactive Video Generation with Stable
  Diffusion

The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion

8 September 2023
Yujin Jeong
Won-Wha Ryoo
Seunghyun Lee
Dabin Seo
Wonmin Byeon
Sangpil Kim
Jinkyu Kim
    DiffM
ArXivPDFHTML

Papers citing "The Power of Sound (TPoS): Audio Reactive Video Generation with Stable Diffusion"

22 / 22 papers shown
Title
Replay-Based Continual Learning with Dual-Layered Distillation and a Streamlined U-Net for Efficient Text-to-Image Generation
Replay-Based Continual Learning with Dual-Layered Distillation and a Streamlined U-Net for Efficient Text-to-Image Generation
Md. Naimur Asif Borno
Md Sakib Hossain Shovon
Asmaa Soliman Al-Moisheer
Mohammad Ali Moni
24
0
0
11 May 2025
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation
Xingrui Wang
Jiang-Long Liu
Z. Wang
Xiaodong Yu
Jialian Wu
X. Sun
Yusheng Su
Alan L. Yuille
Zicheng Liu
Emad Barsoum
DiffM
VGen
46
0
0
13 Apr 2025
Artificial Intelligence for Biomedical Video Generation
Artificial Intelligence for Biomedical Video Generation
Linyuan Li
Jianing Qiu
Anujit Saha
Lin Li
Poyuan Li
Mengxian He
Ziyu Guo
Wu Yuan
VGen
58
1
0
12 Nov 2024
Language-Guided Joint Audio-Visual Editing via One-Shot Adaptation
Language-Guided Joint Audio-Visual Editing via One-Shot Adaptation
Susan Liang
Chao Huang
Yapeng Tian
Anurag Kumar
Chenliang Xu
DiffM
34
6
0
09 Oct 2024
D$^4$M: Dataset Distillation via Disentangled Diffusion Model
D4^44M: Dataset Distillation via Disentangled Diffusion Model
Duo Su
Junjie Hou
Weizhi Gao
Yingjie Tian
Bowen Tang
DD
35
18
0
21 Jul 2024
Masked Generative Video-to-Audio Transformers with Enhanced
  Synchronicity
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
Santiago Pascual
Chunghsin Yeh
Ioannis Tsiamas
Joan Serra
DiffM
VGen
33
13
0
15 Jul 2024
Read, Watch and Scream! Sound Generation from Text and Video
Read, Watch and Scream! Sound Generation from Text and Video
Yujin Jeong
Yunji Kim
Sanghyuk Chun
Jiyoung Lee
VGen
DiffM
25
11
0
08 Jul 2024
Sequential Contrastive Audio-Visual Learning
Sequential Contrastive Audio-Visual Learning
Ioannis Tsiamas
Santiago Pascual
Chunghsin Yeh
Joan Serra
33
2
0
08 Jul 2024
VCoME: Verbal Video Composition with Multimodal Editing Effects
VCoME: Verbal Video Composition with Multimodal Editing Effects
Weibo Gong
Xiaojie Jin
Xin Li
Dongliang He
Xinglong Wu
35
0
0
05 Jul 2024
Diffusion Model-Based Video Editing: A Survey
Diffusion Model-Based Video Editing: A Survey
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Dacheng Tao
VGen
55
22
0
26 Jun 2024
AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection
AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection
Trevine Oorloff
Surya Koppisetti
Nicolò Bonettini
Divyaraj Solanki
Ben Colman
Yaser Yacoob
Ali Shahriyari
Gaurav Bharaj
30
20
0
05 Jun 2024
SonicDiffusion: Audio-Driven Image Generation and Editing with
  Pretrained Diffusion Models
SonicDiffusion: Audio-Driven Image Generation and Editing with Pretrained Diffusion Models
Burak Can Biner
Farrin Marouf Sofian
Umur Berkay Karakacs
Duygu Ceylan
Erkut Erdem
Aykut Erdem
21
7
0
01 May 2024
Audio-Synchronized Visual Animation
Audio-Synchronized Visual Animation
Lin Zhang
Shentong Mo
Yijing Zhang
Pedro Morgado
DiffM
40
18
0
08 Mar 2024
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for
  Text-to-Image Generation
Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation
Seung Hyun Lee
Yinxiao Li
Junjie Ke
Innfarn Yoo
Han Zhang
...
Junfeng He
Gang Li
Sangpil Kim
Irfan Essa
Feng Yang
EGVM
27
18
0
11 Jan 2024
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy
Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy
Weijian Mai
Jian Zhang
Pengfei Fang
Zhijun Zhang
37
9
0
31 Dec 2023
Diffusion for Natural Image Matting
Diffusion for Natural Image Matting
Yihan Hu
Yiheng Lin
Wei Wang
Yao-Min Zhao
Yunchao Wei
Humphrey Shi
18
7
0
10 Dec 2023
CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional
  Modeling
CMMD: Contrastive Multi-Modal Diffusion for Video-Audio Conditional Modeling
Ruihan Yang
H. Gamper
Sebastian Braun
DiffM
22
5
0
08 Dec 2023
ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation
ElasticDiffusion: Training-free Arbitrary Size Image Generation through Global-Local Content Separation
Moayed Haji-Ali
Guha Balakrishnan
Vicente Ordonez
35
23
0
30 Nov 2023
Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of
  Latent-Based Diffusion Models
Synthetic Shifts to Initial Seed Vector Exposes the Brittle Nature of Latent-Based Diffusion Models
Poyuan Mao
Shashank Kotyan
Tham Yik Foong
Danilo Vasconcellos Vargas
22
5
0
24 Nov 2023
A Survey on Video Diffusion Models
A Survey on Video Diffusion Models
Zhen Xing
Qijun Feng
Haoran Chen
Qi Dai
Hang-Rui Hu
Hang Xu
Zuxuan Wu
Yu-Gang Jiang
EGVM
VGen
55
115
0
16 Oct 2023
EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware
  Motion Model
EAMM: One-Shot Emotional Talking Face via Audio-Based Emotion-Aware Motion Model
Xinya Ji
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Wayne Wu
Feng Xu
Xun Cao
CVBM
50
157
0
30 May 2022
Sound2Sight: Generating Visual Dynamics from Sound and Context
Sound2Sight: Generating Visual Dynamics from Sound and Context
A. Cherian
Moitreya Chatterjee
N. Ahuja
VGen
64
35
0
23 Jul 2020
1