Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2310.15247
Cited By
SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis
23 October 2023
Marco Comunità
R. F. Gramaccioni
Emilian Postolache
Emanuele Rodolà
Danilo Comminiello
Joshua D. Reiss
DiffM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"SyncFusion: Multimodal Onset-synchronized Video-to-Audio Foley Synthesis"
13 / 13 papers shown
Title
TARO: Timestep-Adaptive Representation Alignment with Onset-Aware Conditioning for Synchronized Video-to-Audio Synthesis
Tri Ton
Ji Woo Hong
Chang D. Yoo
VGen
24
0
0
08 Apr 2025
KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation
Yoonjin Chung
Pilsun Eu
Junwon Lee
Keunwoo Choi
Juhan Nam
Ben Sangbae Chon
EGVM
57
3
0
21 Feb 2025
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Yuki Mitsufuji
VGen
DiffM
63
4
0
26 Sep 2024
Guess What I Think: Streamlined EEG-to-Image Generation with Latent Diffusion Models
Eleonora Lopez
Luigi Sigillo
Federica Colonnese
Massimo Panella
Danilo Comminiello
DiffM
41
1
0
17 Sep 2024
Efficient Video to Audio Mapper with Visual Scene Detection
Mingjing Yi
Ming Li
VGen
15
3
0
15 Sep 2024
MambaFoley: Foley Sound Generation using Selective State-Space Models
Marco Furio Colombo
Francesca Ronchini
Luca Comanducci
Fabio Antonacci
Mamba
20
1
0
13 Sep 2024
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment
Yong Ren
Chenxing Li
Manjie Xu
Wei Liang
Yu Gu
Rilin Chen
Dong Yu
VGen
DiffM
43
6
0
13 Sep 2024
D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matching
Jingyu Liu
Minquan Wang
Ye Ma
Bo Wang
Aozhu Chen
Quan Chen
Peng Jiang
Xirong Li
38
1
0
23 Aug 2024
Video-Foley: Two-Stage Video-To-Sound Generation via Temporal Event Condition For Foley Sound
Junwon Lee
Jaekwon Im
Dabin Kim
Juhan Nam
VGen
18
9
0
21 Aug 2024
Read, Watch and Scream! Sound Generation from Text and Video
Yujin Jeong
Yunji Kim
Sanghyuk Chun
Jiyoung Lee
VGen
DiffM
25
11
0
08 Jul 2024
AudioTime: A Temporally-aligned Audio-text Benchmark Dataset
Zeyu Xie
Xuenan Xu
Zhizheng Wu
Mengyue Wu
AuLLM
40
5
0
03 Jul 2024
SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond
Marco Comunità
Zhi-Wei Zhong
Akira Takahashi
Shiqi Yang
Mengjie Zhao
Koichi Saito
Yukara Ikemiya
Takashi Shibuya
Shusuke Takahashi
Yuki Mitsufuji
47
2
0
25 Jun 2024
Naturalistic Music Decoding from EEG Data via Latent Diffusion Models
Emilian Postolache
Natalia Polouliakh
Hiroaki Kitano
Akima Connelly
Emanuele Rodolà
Luca Cosmo
Taketo Akama
MedIm
DiffM
35
2
0
15 May 2024
1