2311.10057
Cited By
The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation
16 November 2023
Ilaria Manco
Benno Weck
Seungheon Doh
Minz Won
Yixiao Zhang
Dmitry Bogdanov
Yusong Wu
Ke Chen
Philip Tovstogan
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
Juhan Nam
Papers citing
"The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation"
23 / 23 papers shown
Title
DRAGON: Distributional Rewards Optimize Diffusion Generative Models
Yatong Bai
Jonah Casebeer
Somayeh Sojoudi
Nicholas J. Bryan
DiffM
VLM
39
1
0
21 Apr 2025
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
Sifei Li
Mining Tan
Feier Shen
Minyan Luo
Zijiao Yin
Fan Tang
W. Dong
Changsheng Xu
57
0
0
17 Apr 2025
Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer
Siyuan Hou
Shansong Liu
Ruibin Yuan
Wei Xue
Ying Shan
Mangsuo Zhao
Chao Zhang
87
3
0
17 Jan 2025
SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers
Joseph Liu
Joshua Geddes
Ziyu Guo
Haomiao Jiang
Mahesh Kumar Nandwana
44
0
0
15 Nov 2024
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
S. Sakshi
Utkarsh Tyagi
Sonal Kumar
Ashish Seth
Ramaneswaran Selvakumar
Oriol Nieto
R. Duraiswami
Sreyan Ghosh
Dinesh Manocha
AuLLM
ELM
65
19
0
24 Oct 2024
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models
Shangda Wu
Yashan Wang
Ruibin Yuan
Zhancheng Guo
Xu Tan
...
Yuanliang Dong
Jiafeng Liu
Xiaobing Li
Feng Yu
Maosong Sun
23
3
0
17 Oct 2024
Sound Check: Auditing Audio Datasets
William Agnew
Julia Barnett
Annie Chu
Rachel Hong
Michael Feffer
Robin Netzorg
Harry H. Jiang
Ezra Awumey
Sauvik Das
31
1
0
17 Oct 2024
Presto! Distilling Steps and Layers for Accelerating Music Generation
Zachary Novack
Ge Zhu
Jonah Casebeer
Julian McAuley
Taylor Berg-Kirkpatrick
Nicholas J. Bryan
45
4
0
07 Oct 2024
Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval
Seungheon Doh
Minhee Lee
Dasaem Jeong
Juhan Nam
54
8
0
04 Oct 2024
CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation
Junda Wu
Warren Li
Zachary Novack
Amit Namburi
Carol Chen
Julian McAuley
VLM
24
0
0
03 Oct 2024
Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning
Ilaria Manco
Justin Salamon
Oriol Nieto
18
0
0
17 Sep 2024
Evaluation of pretrained language models on music understanding
Yannis Vasilakis
Rachel M. Bittner
Johan Pauwels
18
1
0
17 Sep 2024
FakeMusicCaps: a Dataset for Detection and Attribution of Synthetic Music Generated via Text-to-Music Models
Luca Comanducci
Paolo Bestagini
Stefano Tubaro
32
6
0
16 Sep 2024
MetaBGM: Dynamic Soundtrack Transformation For Continuous Multi-Scene Experiences With Ambient Awareness And Personalization
Haoxuan Liu
Zihao Wang
HaoRong Hong
Youwei Feng
Jiaxin Yu
Han Diao
Yunfei Xu
K. Zhang
24
0
0
05 Sep 2024
FLUX that Plays Music
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Junshi Huang
76
7
0
01 Sep 2024
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models
Yunwen Xia
Hui Fang
Emmanouil Benetos
Jie Zhang
Chong Long
Dmitry Bogdanov
AuLLM
41
1
0
02 Aug 2024
CoMMIT: Coordinated Instruction Tuning for Multimodal Large Language Models
Junda Wu
Xintong Li
Tong Yu
Yu-Xiang Wang
Xiang Chen
Jiuxiang Gu
Lina Yao
Jingbo Shang
Julian McAuley
37
0
0
29 Jul 2024
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation
Junda Wu
Zachary Novack
Amit Namburi
Jiaheng Dai
Hao-Wen Dong
Zhouhang Xie
Carol Chen
Julian McAuley
33
1
0
29 Jul 2024
Stable Audio Open
Zach Evans
Julian Parker
CJ Carr
Zack Zukowski
Josiah Taylor
Jordi Pons
61
36
0
19 Jul 2024
Long-form music generation with latent diffusion
Zach Evans
Julian Parker
CJ Carr
Zack Zukowski
Josiah Taylor
Jordi Pons
MGen
DiffM
30
10
0
16 Apr 2024
Audio Dialogues: Dialogues dataset for audio and music understanding
Arushi Goel
Zhifeng Kong
Rafael Valle
Bryan Catanzaro
AuLLM
24
4
0
11 Apr 2024
MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models
Yixiao Zhang
Yukara Ikemiya
Gus Xia
Naoki Murata
Marco A. Martínez Ramírez
Wei-Hsiang Liao
Yuki Mitsufuji
Simon Dixon
33
20
0
09 Feb 2024
MusCaps: Generating Captions for Music Audio
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
30
36
0
24 Apr 2021