2406.00320
Cited By
Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching
1 June 2024
Yongqi Wang
Wenxiang Guo
Rongjie Huang
Jia-Bin Huang
Zehan Wang
Fuming You
Ruiqi Li
Zhou Zhao
VGen
DiffM
Papers citing
"Frieren: Efficient Video-to-Audio Generation Network with Rectified Flow Matching"
10 / 10 papers shown
Flow Diverse and Efficient: Learning Momentum Flow Matching via Stochastic Velocity Field Sampling
Zhiyuan Ma
Ruixun Liu
Sixian Liu
Jianjun Li
Bowen Zhou
18
0
0
10 Jun 2025
AudioGenie: A Training-Free Multi-Agent Framework for Diverse Multimodality-to-Multiaudio Generation
Yan Rong
Jinting Wang
Shan Yang
Guangzhi Lei
Li Liu
DiffM
VGen
73
0
0
28 May 2025
Hearing from Silence: Reasoning Audio Descriptions from Silent Videos via Vision-Language Model
Yong Ren
Chenxing Li
Le Xu
Hao Gu
Duzhen Zhang
Yujie Chen
Manjie Xu
Ruibo Fu
Shan Yang
Dong Yu
LRM
84
0
0
19 May 2025
AudioX: Diffusion Transformer for Anything-to-Audio Generation
Zeyue Tian
Yizhu Jin
Zhaoyang Liu
Ruibin Yuan
Xu Tan
Qifeng Chen
Wei Xue
Yu Guo
114
6
0
13 Mar 2025
MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm
Ziyan Guo
Zeyu Hu
Na Zhao
De Wen Soh
VGen
199
3
0
13 Mar 2025
KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation
Yoonjin Chung
Pilsun Eu
Junwon Lee
Keunwoo Choi
Juhan Nam
Ben Sangbae Chon
EGVM
107
4
0
21 Feb 2025
MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Ho Kei Cheng
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Alex Schwing
Yuki Mitsufuji
VGen
288
18
0
19 Dec 2024
Gotta Hear Them All: Sound Source Aware Vision to Audio Generation
Wei Guo
Heng Wang
Jianbo Ma
Weidong Cai
DiffM
176
5
0
23 Nov 2024
A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation
Masato Ishii
Akio Hayakawa
Takashi Shibuya
Yuki Mitsufuji
VGen
DiffM
161
4
0
26 Sep 2024
STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment
Yong Ren
Chenxing Li
Manjie Xu
Wei Liang
Yu Gu
Rilin Chen
Dong Yu
VGen
DiffM
99
9
0
13 Sep 2024