2110.06691
Cited By
v1
v2 (latest)
Diverse Audio Captioning via Adversarial Training
13 October 2021
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
GAN
Papers citing
"Diverse Audio Captioning via Adversarial Training"
20 / 20 papers shown
Title
From Contrast to Commonality: Audio Commonality Captioning for Enhanced Audio-Text Cross-modal Understanding in Multimodal LLMs
Yuhang Jia
Xu Zhang
Yong Qin
Yang Chen
Shiwan Zhao
VLM
63
0
0
03 Aug 2025
Extremely Simple Out-of-distribution Detection for Audio-visual Generalized Zero-shot Learning
Yang Liu
Xinming Zhang
Jiale Du
Xinbo Gao
Jungong Han
OODD
163
0
0
28 Mar 2025
Mellow: a small audio language model for reasoning
Soham Deshmukh
Satvik Dixit
Rita Singh
Bhiksha Raj
AuLLM
ReLM
LRM
187
16
0
11 Mar 2025
Audio-Language Datasets of Scenes and Events: A Survey
IEEE Access, 2024
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
314
6
0
10 Jan 2025
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Xuenan Xu
Haohe Liu
Mengyue Wu
Wenwu Wang
Mark D. Plumbley
150
4
0
19 Jul 2024
AMA-LSTM: Pioneering Robust and Fair Financial Audio Analysis for Stock Volatility Prediction
Shengkun Wang
Taoran Ji
Jianfeng He
Mariam Almutairi
Dan Wang
Linhan Wang
Min Zhang
Chang-Tien Lu
100
4
0
03 Jul 2024
On the Audio Hallucinations in Large Audio-Video Language Models
Taichi Nishimura
Shota Nakada
Masayoshi Kondo
VLM
126
11
0
18 Jan 2024
Training Audio Captioning Models without Audio
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Soham Deshmukh
Benjamin Elizalde
Dimitra Emmanouilidou
Bhiksha Raj
Rita Singh
Huaming Wang
135
24
0
14 Sep 2023
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
Etienne Labbé
Thomas Pellegrini
J. Pinquier
165
17
0
01 Sep 2023
Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement
Daiki Takeuchi
Yasunori Ohishi
Daisuke Niizumi
Noboru Harada
K. Kashino
144
8
0
23 Aug 2023
Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning
Interspeech, 2023
Jianyuan Sun
Xubo Liu
Xinhao Mei
V. Kılıç
Mark D. Plumbley
Wenwu Wang
111
3
0
30 May 2023
Towards Generating Diverse Audio Captions via Adversarial Training
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
159
4
0
05 Dec 2022
Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention
Interspeech, 2022
Xubo Liu
Qiushi Huang
Xinhao Mei
Haohe Liu
Qiuqiang Kong
...
Yu Zhang
Lilian H. Y. Tang
Mark D. Plumbley
Volkan Kılıç
Wenwu Wang
258
25
0
28 Oct 2022
Text-to-Audio Grounding Based Novel Metric for Evaluating Audio Caption Similarity
Swapnil Bhosale
Rupayan Chakraborty
Sunil Kumar Kopparapu
102
1
0
03 Oct 2022
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
EURASIP Journal on Audio, Speech, and Music Processing (EURASIP J. Audio Speech Music Process.), 2022
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
164
52
0
12 May 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Xuenan Xu
Zeyu Xie
Mengyue Wu
K. Yu
169
19
0
11 May 2022
Automated Audio Captioning using Audio Event Clues
Ayşegül Özkaya Eren
M. Sert
90
0
0
18 Apr 2022
Separate What You Describe: Language-Queried Audio Source Separation
Interspeech, 2022
Xubo Liu
Haohe Liu
Qiuqiang Kong
Xinhao Mei
Jinzheng Zhao
Qiushi Huang
Mark D. Plumbley
Wenwu Wang
164
83
0
28 Mar 2022
Leveraging Pre-trained BERT for Audio Captioning
European Signal Processing Conference (EUSIPCO), 2022
Xubo Liu
Xinhao Mei
Qiushi Huang
Jianyuan Sun
Jinzheng Zhao
Haohe Liu
Mark D. Plumbley
Volkan Kılıç
Wenwu Wang
175
32
0
06 Mar 2022
Local Information Assisted Attention-free Decoder for Audio Captioning
IEEE Signal Processing Letters (SPL), 2022
Feiyang Xiao
Jian Guan
Haiyan Lan
Qiaoxi Zhu
Wenwu Wang
153
12
0
10 Jan 2022