Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss

10 March 2023
Yifei Xin
Dongchao Yang
Yuexian Zou
arXiv: 2303.05681

Papers citing "Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss"

21 / 21 papers shown
Language-based Audio Retrieval with Co-Attention Networks
Haoran Sun
Z. Wang
Qiuyi Chen
Jianjun Chen
Jia Wang
Haiyang Zhang
29
0
0
31 Dec 2024
Dissecting Temporal Understanding in Text-to-Audio Retrieval
Andreea-Maria Oncescu
João F. Henriques
A. Sophia Koepke
19
2
0
01 Sep 2024
Estimated Audio-Caption Correspondences Improve Language-Based Audio Retrieval
Paul Primus
Florian Schmid
Gerhard Widmer
29
2
0
21 Aug 2024
Fusing Audio and Metadata Embeddings Improves Language-based Audio Retrieval
Paul Primus
Gerhard Widmer
45
3
0
22 Jun 2024
Towards Spoken Language Understanding via Multi-level Multi-grained Contrastive Learning
Xuxin Cheng
Wanshi Xu
Zhihong Zhu
Hongxiang Li
Yuexian Zou
59
13
0
31 May 2024
Distance Sampling-based Paraphraser Leveraging ChatGPT for Text Data Manipulation
Yoori Oh
Yoseob Han
Kyogu Lee
32
1
0
01 May 2024
T-CLAP: Temporal-Enhanced Contrastive Language-Audio Pretraining
Yi Yuan
Zhuo Chen
Xubo Liu
Haohe Liu
Xuenan Xu
Dongya Jia
Yuanzhe Chen
Mark D. Plumbley
Wenwu Wang
CLIP
VLM
35
9
0
27 Apr 2024
Multiscale Matching Driven by Cross-Modal Similarity Consistency for Audio-Text Retrieval
Qian Wang
Jia-Chen Gu
Zhen-Hua Ling
30
2
0
15 Mar 2024
MINT: Boosting Audio-Language Model via Multi-Target Pre-Training and Instruction Tuning
Hang Zhao
Yifei Xin
Zhesong Yu
Bilei Zhu
Lu Lu
Zejun Ma
AuLLM
26
4
0
12 Feb 2024
Masked Audio Modeling with CLAP and Multi-Objective Learning
Yifei Xin
Xiulian Peng
Yan Lu
42
8
0
29 Jan 2024
CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing
Xianghu Yue
Xiaohai Tian
Lu Lu
Malu Zhang
Zhizheng Wu
Haizhou Li
25
0
0
22 Jan 2024
FunnyNet-W: Multimodal Learning of Funny Moments in Videos in the Wild
Zhi-Song Liu
Robin Courant
Vicky Kalogeiton
25
6
0
08 Jan 2024
A Language-based solution to enable Metaverse Retrieval
Ali Abdari
Alex Falcon
Giuseppe Serra
DiffM
21
4
0
22 Dec 2023
ML-LMCL: Mutual Learning and Large-Margin Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding
Xuxin Cheng
Bowen Cao
Qichen Ye
Zhihong Zhu
Hongxiang Li
Yuexian Zou
19
25
0
19 Nov 2023
Sound of Story: Multi-modal Storytelling with Audio
Jaeyeon Bae
Seokhoon Jeong
Seokun Kang
Namgi Han
Jae-Yon Lee
Hyounghun Kim
Taehwan Kim
21
2
0
30 Oct 2023
Contrastive Latent Space Reconstruction Learning for Audio-Text Retrieval
Kaiyi Luo
Xulong Zhang
Jianzong Wang
Huaxiong Li
Ning Cheng
Jing Xiao
61
2
0
16 Sep 2023
Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval?
Etienne Labbé
Thomas Pellegrini
J. Pinquier
8
5
0
29 Aug 2023
Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions
Yifei Xin
Yuexian Zou
39
9
0
28 Jul 2023
ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Peng Wang
Shijie Wang
Junyang Lin
Shuai Bai
Xiaohuan Zhou
Jingren Zhou
Xinggang Wang
Chang Zhou
VLM
MLLM
ObjD
16
114
0
18 May 2023
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
Xinhao Mei
Chutong Meng
Haohe Liu
Qiuqiang Kong
Tom Ko
Chengqi Zhao
Mark D. Plumbley
Yuexian Zou
Wenwu Wang
43
192
0
30 Mar 2023
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection
Ke Chen
Xingjian Du
Bilei Zhu
Zejun Ma
Taylor Berg-Kirkpatrick
Shlomo Dubnov
ViT
114
264
0
02 Feb 2022