Automated Audio Captioning: An Overview of Recent Progress and New Challenges

EURASIP Journal on Audio, Speech, and Music Processing (EURASIP J. Audio Speech Music Process.), 2022

12 May 2022

Papers citing "Automated Audio Captioning: An Overview of Recent Progress and New Challenges"

Spatial-CLAP: Learning Spatially-Aware Audio-Text Embeddings for Multi-Source Conditions

113

18 Sep 2025

MAGIC-Enhanced Keyword Prompting for Zero-Shot Audio Captioning with CLIP Models

Gautam Siddharth Kashyap

VLM

109

16 Sep 2025

AC/DC: LLM-based Audio Comprehension via Dialogue Continuation

292

12 Jun 2025

CLAP-ART: Automated Audio Captioning with Semantic-rich Audio Representation Tokenizer

189

01 Jun 2025

Mellow: a small audio language model for reasoning

293

11 Mar 2025

Audio-Language Datasets of Scenes and Events: A Survey

IEEE Access (IEEE Access), 2024

469

10 Jan 2025

Describe Where You Are: Improving Noise-Robustness for Speech Emotion Recognition with Text Description of the Environment

IEEE Transactions on Affective Computing (IEEE Trans. Affective Comput.), 2024

229

25 Jul 2024

ParaCLAP – Towards a general language-audio model for computational paralinguistic tasks

Xin Jing

Andreas Triantafyllopoulos

Björn Schuller

153

11 Jun 2024

AudioSetMix: Enhancing Audio-Language Datasets with LLM-Assisted Augmentations

David Xu

253

17 May 2024

ACES: Evaluating Automated Audio Captioning Models on the Semantics of Sounds

214

27 Mar 2024

EDTC: enhance depth of text comprehension in automated audio captioning

Liwen Tan

Yin Cao

Yi Zhou

207

27 Feb 2024

Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction

IEEE Transactions on Cognitive and Developmental Systems (IEEE TCDS), 2023

Kay Chen Tan

349

11 Oct 2023

A Large-scale Dataset for Audio-Language Representation Learning

ACM Multimedia (ACM MM), 2023

373

20 Sep 2023

Audio Difference Learning for Audio Captioning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

184

15 Sep 2023

Separate Anything You Describe

IEEE Transactions on Audio, Speech, and Language Processing (TASLP), 2023

Yuxuan Wang

314

09 Aug 2023

Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions

Interspeech (Interspeech), 2023

Yifei Xin

Yuexian Zou

392

28 Jul 2023

Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer

European Signal Processing Conference (EUSIPCO), 2023

Etienne Labbé

J. Pinquier

Thomas Pellegrini

211

02 May 2023

Graph Attention for Automated Audio Captioning

IEEE Signal Processing Letters (IEEE SPL), 2023

200

07 Apr 2023

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

337

311

30 Mar 2023

Towards Generating Diverse Audio Captions via Adversarial Training

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

296

05 Dec 2022

Impact of visual assistance for automated audio captioning

Wim Boes

Hugo Van hamme

202

18 Nov 2022

Investigations in Audio Captioning: Addressing Vocabulary Imbalance and Evaluating Suitability of Language-Centric Performance Metrics

Sandeep Reddy Kothinti

Dimitra Emmanouilidou

249

12 Nov 2022

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention

Interspeech (Interspeech), 2022

...

409

28 Oct 2022

Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022

283

11 May 2022

Local Information Assisted Attention-free Decoder for Audio Captioning

IEEE Signal Processing Letters (SPL), 2022

277

10 Jan 2022