2310.10159
Joint Music and Language Attention Models for Zero-shot Music Tagging
16 October 2023
Xingjian Du
Zhesong Yu
Jiaju Lin
Bilei Zhu
Qiuqiang Kong
BDL
VLM
Papers citing "Joint Music and Language Attention Models for Zero-shot Music Tagging"
3 / 3 papers shown
Title
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
114
262
0
02 Feb 2022
SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing
Junyi Ao
Rui Wang
Long Zhou
Chengyi Wang
Shuo Ren
...
Yu Zhang
Zhihua Wei
Yao Qian
Jinyu Li
Furu Wei
110
192
0
14 Oct 2021