2405.02801
Cited By
Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models
5 May 2024
Tianze Xu
Jiajun Li
Xuesong Chen
Xinrui Yao
Shuchang Liu
Papers citing "Mozart's Touch: A Lightweight Multi-modal Music Generation Framework Based on Pre-Trained Large Models"
2 / 2 papers shown
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Junnan Li
Dongxu Li
Caiming Xiong
S. Hoi
MLLM
BDL
VLM
CLIP
378
4,010
0
28 Jan 2022
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
51
244
0
14 Jul 2021