2307.15344
Cited By
Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions
28 July 2023
Yifei Xin
Yuexian Zou
Papers citing "Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions"
7 / 7 papers shown
Title
DiffATR: Diffusion-based Generative Modeling for Audio-Text Retrieval
Yifei Xin
Xuxin Cheng
Zhihong Zhu
Xusheng Yang
Yuexian Zou
DiffM
13
0
0
16 Sep 2024
Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning
Xuxin Cheng
Wanshi Xu
Zhihong Zhu
Hongxiang Li
Yuexian Zou
14
13
0
31 May 2024
MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning
Hang Zhao
Yifei Xin
Zhesong Yu
Bilei Zhu
Lu Lu
Zejun Ma
AuLLM
10
0
0
12 Feb 2024
Masked Audio Modeling with CLAP and Multi-Objective Learning
Yifei Xin
Xiulian Peng
Yan Lu
18
4
0
29 Jan 2024
Improving Weakly Supervised Sound Event Detection with Causal Intervention
Yifei Xin
Dongchao Yang
Fan Cui
Yujun Wang
Yuexian Zou
CML
33
6
0
10 Mar 2023
Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Wenhao Wu
Haipeng Luo
Bo Fang
Jingdong Wang
Wanli Ouyang
78
47
0
31 Dec 2022
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
98
181
0
02 Feb 2022