SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis

23 October 2023

Marco Comunità

R. F. Gramaccioni

Emilian Postolache

Emanuele Rodolà

Danilo Comminiello

Joshua D. Reiss

Papers citing "SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis"

Title | Authors | Tags | Counts | Date
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis | Tri Ton, Ji Woo Hong, Chang D. Yoo | VGen | 24 0 0 | 08 Apr 2025
KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation | Yoonjin Chung, Pilsun Eu, Junwon Lee, Keunwoo Choi, Juhan Nam, Ben Sangbae Chon | EGVM | 57 3 0 | 21 Feb 2025
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation | Masato Ishii, Akio Hayakawa, Takashi Shibuya, Yuki Mitsufuji | VGen, DiffM | 63 4 0 | 26 Sep 2024
Guess What I Think: Streamlined EEG-to-Image Generation with Latent Diffusion Models | Eleonora Lopez, Luigi Sigillo, Federica Colonnese, Massimo Panella, Danilo Comminiello | DiffM | 41 1 0 | 17 Sep 2024
Efficient Video to Audio Mapper with Visual Scene Detection | Mingjing Yi, Ming Li | VGen | 15 3 0 | 15 Sep 2024
MambaFoley: Foley Sound Generation using Selective State-Space Models | Marco Furio Colombo, Francesca Ronchini, Luca Comanducci, Fabio Antonacci | Mamba | 20 1 0 | 13 Sep 2024
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment | Yong Ren, Chenxing Li, Manjie Xu, Wei Liang, Yu Gu, Rilin Chen, Dong Yu | VGen, DiffM | 43 6 0 | 13 Sep 2024
D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matching | Jingyu Liu, Minquan Wang, Ye Ma, Bo Wang, Aozhu Chen, Quan Chen, Peng Jiang, Xirong Li | - | 38 1 0 | 23 Aug 2024
Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound | Junwon Lee, Jaekwon Im, Dabin Kim, Juhan Nam | VGen | 18 9 0 | 21 Aug 2024
Read, Watch and Scream! Sound Generation from Text and Video | Yujin Jeong, Yunji Kim, Sanghyuk Chun, Jiyoung Lee | VGen, DiffM | 25 11 0 | 08 Jul 2024
AudioTime: A Temporally-aligned Audio-text Benchmark Dataset | Zeyu Xie, Xuenan Xu, Zhizheng Wu, Mengyue Wu | AuLLM | 40 5 0 | 03 Jul 2024
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond | Marco Comunità, Zhi-Wei Zhong, Akira Takahashi, Shiqi Yang, Mengjie Zhao, Koichi Saito, Yukara Ikemiya, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji | - | 47 2 0 | 25 Jun 2024
Naturalistic Music Decoding from EEG Data via Latent Diffusion Models | Emilian Postolache, Natalia Polouliakh, Hiroaki Kitano, Akima Connelly, Emanuele Rodolà, Luca Cosmo, Taketo Akama | MedIm, DiffM | 35 2 0 | 15 May 2024