ResearchTrend.AI
LP-MusicCaps: LLM-Based Pseudo Music Captioning

31 July 2023
Seungheon Doh
Keunwoo Choi
Jongpil Lee
Juhan Nam

Papers citing "LP-MusicCaps: LLM-Based Pseudo Music Captioning"

50 / 53 papers shown
Versatile Framework for Song Generation with Prompt-based Control
Y. Zhang
Wenxiang Guo
Changhao Pan
Z. Zhu
Ruiqi Li
...
Rongjie Huang
Ruiyuan Zhang
Zhiqing Hong
Ziyue Jiang
Zhou Zhao
68
1
0
27 Apr 2025
A Survey on Cross-Modal Interaction Between Music and Multimodal Data
Sifei Li
Mining Tan
Feier Shen
Minyan Luo
Zijiao Yin
Fan Tang
W. Dong
Changsheng Xu
55
0
0
17 Apr 2025
ALMTokenizer: A Low-bitrate and Semantic-rich Audio Codec Tokenizer for Audio Language Modeling
Dongchao Yang
Songxiang Liu
Haohan Guo
Jiankun Zhao
Yuanyuan Wang
...
Xubo Liu
Xueyuan Chen
Xu Tan
Xixin Wu
H. Meng
24
0
0
14 Apr 2025
Qwen2.5-Omni Technical Report
Jin Xu
Zhifang Guo
Jinzheng He
Hangrui Hu
Ting He
...
K. Dang
Bin Zhang
X. Wang
Yunfei Chu
Junyang Lin
VGen
AuLLM
84
12
0
26 Mar 2025
Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities
Sreyan Ghosh
Zhifeng Kong
Sonal Kumar
S. Sakshi
Jaehyeon Kim
Wei Ping
Rafael Valle
Dinesh Manocha
Bryan Catanzaro
MLLM
AuLLM
LRM
41
4
0
06 Mar 2025
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation
Z. Liu
Shuangrui Ding
Zhixiong Zhang
Xiaoyi Dong
Pan Zhang
Yuhang Zang
Y. Cao
D. Lin
Jiaqi Wang
69
0
0
18 Feb 2025
Audio-Language Models for Audio-Centric Tasks: A survey
Yi Su
Jisheng Bai
Qisheng Xu
Kele Xu
Yong Dou
AuLLM
90
1
0
28 Jan 2025
AI TrackMate: Finally, Someone Who Will Give Your Music More Than Just "Sounds Great!"
Yi-Lin Jiang
Chia-Ho Hsiung
Yen-Tung Yeh
Lu-Rong Chen
Bo-Yu Chen
59
0
0
09 Dec 2024
Combining Genre Classification and Harmonic-Percussive Features with Diffusion Models for Music-Video Generation
Leonardo Pina
Yongmin Li
VGen
DiffM
71
0
0
07 Dec 2024
Sing-On-Your-Beat: Simple Text-Controllable Accompaniment Generations
Quoc-Huy Trinh
Minh-Van Nguyen
Trong-Hieu Nguyen-Mau
Khoa Tran
Thanh Do
18
0
0
03 Nov 2024
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
S. Sakshi
Utkarsh Tyagi
Sonal Kumar
Ashish Seth
Ramaneswaran Selvakumar
Oriol Nieto
R. Duraiswami
Sreyan Ghosh
Dinesh Manocha
AuLLM
ELM
59
19
0
24 Oct 2024
OpenMU: Your Swiss Army Knife for Music Understanding
Mengjie Zhao
Zhi-Wei Zhong
Zhuoyuan Mao
Shiqi Yang
Wei-Hsiang Liao
Shusuke Takahashi
Hiromi Wakaki
Yuki Mitsufuji
OSLM
33
0
0
21 Oct 2024
CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models
Shangda Wu
Yashan Wang
Ruibin Yuan
Zhancheng Guo
Xu Tan
...
Yuanliang Dong
Jiafeng Liu
Xiaobing Li
Feng Yu
Maosong Sun
15
3
0
17 Oct 2024
EmotionCaps: Enhancing Audio Captioning Through Emotion-Augmented Data Generation
Mithun Manivannan
Vignesh Nethrapalli
Mark Cartwright
11
0
0
15 Oct 2024
Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval
Seungheon Doh
Minhee Lee
Dasaem Jeong
Juhan Nam
41
8
0
04 Oct 2024
CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation
Junda Wu
Warren Li
Zachary Novack
Amit Namburi
Carol Chen
Julian McAuley
VLM
19
0
0
03 Oct 2024
Generating Symbolic Music from Natural Language Prompts using an LLM-Enhanced Dataset
Weihan Xu
Julian McAuley
Taylor Berg-Kirkpatrick
Shlomo Dubnov
Hao-Wen Dong
15
0
0
02 Oct 2024
Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning
Ilaria Manco
Justin Salamon
Oriol Nieto
18
1
0
17 Sep 2024
Prevailing Research Areas for Music AI in the Era of Foundation Models
Megan Wei
M. Modrzejewski
Aswin Sivaraman
Dorien Herremans
MedIm
16
0
0
14 Sep 2024
MetaBGM: Dynamic Soundtrack Transformation For Continuous Multi-Scene Experiences With Ambient Awareness And Personalization
Haoxuan Liu
Zihao Wang
HaoRong Hong
Youwei Feng
Jiaxin Yu
Han Diao
Yunfei Xu
K. Zhang
16
0
0
05 Sep 2024
FLUX that Plays Music
Zhengcong Fei
Mingyuan Fan
Changqian Yu
Junshi Huang
71
7
0
01 Sep 2024
Music2P: A Multi-Modal AI-Driven Tool for Simplifying Album Cover Design
Yuanyuan Zhang
Lingxiao Li
Ji-Eun Han
Wonjin Yang
Zhi-Qi Cheng
21
0
0
03 Aug 2024
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models
Yunwen Xia
Hui Fang
Emmanouil Benetos
Jie Zhang
Chong Long
Dmitry Bogdanov
AuLLM
30
1
0
02 Aug 2024
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions
Xiaowei Chi
Yatian Wang
Aosong Cheng
Pengjun Fang
Zeyue Tian
...
Wenhan Luo
Qifeng Chen
Shanghang Zhang
Qi-fei Liu
Yi-Ting Guo
63
1
0
30 Jul 2024
Futga: Towards Fine-grained Music Understanding through Temporally-enhanced Generative Augmentation
Junda Wu
Zachary Novack
Amit Namburi
Jiaheng Dai
Hao-Wen Dong
Zhouhang Xie
Carol Chen
Julian McAuley
25
0
0
29 Jul 2024
I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition
Yannis Vasilakis
Rachel M. Bittner
Johan Pauwels
22
0
0
25 Jul 2024
Accompanied Singing Voice Synthesis with Fully Text-controlled Melody
Ruiqi Li
Zhiqing Hong
Yongqi Wang
Lichao Zhang
Rongjie Huang
Siqi Zheng
Zhou Zhao
18
3
0
02 Jul 2024
Subtractive Training for Music Stem Insertion using Latent Diffusion Models
Ivan Villa-Renteria
Mason L. Wang
Zachary Shah
Zhe Li
Soohyun Kim
Neelesh Ramachandran
Mert Pilanci
21
0
0
27 Jun 2024
Zero-Shot Audio Captioning Using Soft and Hard Prompts
Yiming Zhang
Xuenan Xu
Ruoyi Du
Haohe Liu
Yuan Dong
Zheng-Hua Tan
Wenwu Wang
Zhanyu Ma
VLM
20
0
0
10 Jun 2024
Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT
Le Zhuo
Ruoyi Du
Han Xiao
Yangguang Li
Dongyang Liu
...
Wanli Ouyang
Ziwei Liu
Yu Qiao
Hongsheng Li
Peng Gao
39
5
0
05 Jun 2024
MidiCaps: A large-scale MIDI dataset with text captions
J. Melechovský
Abhinaba Roy
Dorien Herremans
16
2
0
04 Jun 2024
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Patrick Emami
Zhaonan Li
Saumya Sinha
Truc Nguyen
45
1
0
30 May 2024
QA-MDT: Quality-aware Masked Diffusion Transformer for Enhanced Music Generation
Chang Li
Ruoyu Wang
Lijuan Liu
Jun Du
Yixuan Sun
Zilu Guo
Zhenrong Zhang
Yuan Jiang
J. Gao
Feng Ma
28
0
0
24 May 2024
Text-to-Song: Towards Controllable Music Generation Incorporating Vocals and Accompaniment
Zhiqing Hong
Rongjie Huang
Xize Cheng
Yongqi Wang
Ruiqi Li
Fuming You
Zhou Zhao
Zhimeng Zhang
16
3
0
14 Apr 2024
Audio Dialogues: Dialogues dataset for audio and music understanding
Arushi Goel
Zhifeng Kong
Rafael Valle
Bryan Catanzaro
AuLLM
16
4
0
11 Apr 2024
MoralBERT: A Fine-Tuned Language Model for Capturing Moral Values in Social Discussions
Vjosa Preniqi
Iacopo Ghinassi
Julia Ive
C. Saitis
Kyriaki Kalimeri
24
1
0
12 Mar 2024
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Zhifeng Kong
Arushi Goel
Rohan Badlani
Wei Ping
Rafael Valle
Bryan Catanzaro
AuLLM
LM&MA
MLLM
56
73
0
02 Feb 2024
LightHouse: A Survey of AGI Hallucination
Feng Wang
LRM
HILM
VLM
11
1
0
08 Jan 2024
WikiMuTe: A web-sourced dataset of semantic descriptions for music audio
Benno Weck
Holger Kirchhoff
Peter Grosche
Xavier Serra
VLM
8
1
0
14 Dec 2023
Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching
Meng Chu
Zhedong Zheng
Wei Ji
Tingyu Wang
Tat-Seng Chua
10
2
0
21 Nov 2023
The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation
Ilaria Manco
Benno Weck
Seungheon Doh
Minz Won
Yixiao Zhang
...
Philip Tovstogan
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
Juhan Nam
8
14
0
16 Nov 2023
SALMONN: Towards Generic Hearing Abilities for Large Language Models
Changli Tang
Wenyi Yu
Guangzhi Sun
Xianzhao Chen
Tian Tan
Wei Li
Lu Lu
Zejun Ma
Chao Zhang
LM&MA
AuLLM
11
88
0
20 Oct 2023
Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing
Yixiao Zhang
Akira Maezawa
Gus Xia
Kazuhiko Yamamoto
Simon Dixon
39
15
0
19 Oct 2023
LLark: A Multimodal Instruction-Following Language Model for Music
Josh Gardner
Simon Durand
Daniel Stoller
Rachel M. Bittner
AuLLM
13
5
0
11 Oct 2023
MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response
Zihao Deng
Yi Ma
Yudong Liu
Rongchen Guo
Ge Zhang
Wenhu Chen
Wenhao Huang
Emmanouil Benetos
MLLM
AuLLM
13
8
0
15 Sep 2023
A Survey of Hallucination in Large Foundation Models
Vipula Rawte
A. Sheth
Amitava Das
HILM
LRM
6
213
0
12 Sep 2023
Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning
Shansong Liu
Atin Sakkeer Hussain
Chenshuo Sun
Yin Shan
MLLM
11
27
0
22 Aug 2023
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
Xinhao Mei
Chutong Meng
Haohe Liu
Qiuqiang Kong
Tom Ko
Chengqi Zhao
Mark D. Plumbley
Yuexian Zou
Wenwu Wang
30
190
0
30 Mar 2023
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
301
11,730
0
04 Mar 2022
Music Playlist Title Generation: A Machine-Translation Approach
Seungheon Doh
Junwon Lee
Juhan Nam
13
6
0
03 Oct 2021