ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2102.11457
  4. Cited By
Investigating Local and Global Information for Automated Audio
  Captioning with Transfer Learning

Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning

23 February 2021
Xuenan Xu
Heinrich Dinkel
Mengyue Wu
Zeyu Xie
Kai Yu
ArXivPDFHTML

Papers citing "Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning"

40 / 40 papers shown
Title
Contrasting Deep Learning Models for Direct Respiratory Insufficiency
  Detection Versus Blood Oxygen Saturation Estimation
Contrasting Deep Learning Models for Direct Respiratory Insufficiency Detection Versus Blood Oxygen Saturation Estimation
M. Gauy
Natalia Hitomi Koza
Ricardo Mikio Morita
Gabriel Rocha Stanzione
Arnaldo Cândido Júnior
L. Berti
A. S. Levin
E. Sabino
F. Svartman
Marcelo Finger
38
0
0
30 Jul 2024
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Efficient Audio Captioning with Encoder-Level Knowledge Distillation
Xuenan Xu
Haohe Liu
Mengyue Wu
Wenwu Wang
Mark D. Plumbley
40
1
0
19 Jul 2024
Zero-shot audio captioning with audio-language model guidance and audio
  context keywords
Zero-shot audio captioning with audio-language model guidance and audio context keywords
Leonard Salewski
Stefan Fauth
A. Sophia Koepke
Zeynep Akata
19
10
0
14 Nov 2023
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Avamarie Brueggeman
Andrea Madotto
Zhaojiang Lin
Tushar Nagarajan
Matt Smith
...
Peyman Heidari
Yue Liu
Kavya Srinet
Babak Damavandi
Anuj Kumar
MLLM
32
93
0
27 Sep 2023
Weakly-supervised Automated Audio Captioning via text only training
Weakly-supervised Automated Audio Captioning via text only training
Theodoros Kouzelis
V. Katsouros
CLIP
27
6
0
21 Sep 2023
RECAP: Retrieval-Augmented Audio Captioning
RECAP: Retrieval-Augmented Audio Captioning
Sreyan Ghosh
Sonal Kumar
Chandra Kiran Reddy Evuru
R. Duraiswami
Dinesh Manocha
VLM
64
17
0
18 Sep 2023
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
UnIVAL: Unified Model for Image, Video, Audio and Language Tasks
Mustafa Shukor
Corentin Dancette
Alexandre Ramé
Matthieu Cord
MoMe
MLLM
61
42
0
30 Jul 2023
Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions
Improving Audio-Text Retrieval via Hierarchical Cross-Modal Interaction and Auxiliary Captions
Yifei Xin
Yuexian Zou
39
9
0
28 Jul 2023
Improving Audio Caption Fluency with Automatic Error Correction
Improving Audio Caption Fluency with Automatic Error Correction
Hanxue Zhang
Zeyu Xie
Xuenan Xu
Mengyue Wu
K. Yu
21
0
0
16 Jun 2023
Enhance Temporal Relations in Audio Captioning with Sound Event
  Detection
Enhance Temporal Relations in Audio Captioning with Sound Event Detection
Zeyu Xie
Xuenan Xu
Mengyue Wu
K. Yu
21
10
0
02 Jun 2023
Dual Transformer Decoder based Features Fusion Network for Automated
  Audio Captioning
Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning
Jianyuan Sun
Xubo Liu
Xinhao Mei
V. Kılıç
Mark D. Plumbley
Wenwu Wang
20
3
0
30 May 2023
VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and
  Dataset
VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Sihan Chen
Handong Li
Qunbo Wang
Zijia Zhao
Ming-Ting Sun
Xinxin Zhu
J. Liu
32
97
0
29 May 2023
Listen, Think, and Understand
Listen, Think, and Understand
Yuan Gong
Hongyin Luo
Alexander H. Liu
Leonid Karlinsky
James R. Glass
ELM
MLLM
LRM
32
136
0
18 May 2023
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
Sihan Chen
Xingjian He
Longteng Guo
Xinxin Zhu
Weining Wang
Jinhui Tang
Jinhui Tang
VLM
31
102
0
17 Apr 2023
Prefix tuning for automated audio captioning
Prefix tuning for automated audio captioning
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
8
42
0
30 Mar 2023
eP-ALM: Efficient Perceptual Augmentation of Language Models
eP-ALM: Efficient Perceptual Augmentation of Language Models
Mustafa Shukor
Corentin Dancette
Matthieu Cord
MLLM
VLM
32
29
0
20 Mar 2023
BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet
  Tag-guided Synthetic Data
BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data
Xuenan Xu
Zhiling Zhang
Zelin Zhou
Pingyue Zhang
Zeyu Xie
Mengyue Wu
Ke Zhu
CLIP
60
14
0
14 Mar 2023
Towards Generating Diverse Audio Captions via Adversarial Training
Towards Generating Diverse Audio Captions via Adversarial Training
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
33
2
0
05 Dec 2022
Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention
Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention
Xubo Liu
Qiushi Huang
Xinhao Mei
Haohe Liu
Qiuqiang Kong
...
Yu Zhang
Lilian H. Y. Tang
Mark D. Plumbley
Volkan Kilicc
Wenwu Wang
38
18
0
28 Oct 2022
Automated Audio Captioning via Fusion of Low- and High- Dimensional
  Features
Automated Audio Captioning via Fusion of Low- and High- Dimensional Features
Jianyuan Sun
Xubo Liu
Xinhao Mei
Mark D. Plumbley
V. Kılıç
Wenwu Wang
25
3
0
10 Oct 2022
An investigation on selecting audio pre-trained models for audio
  captioning
An investigation on selecting audio pre-trained models for audio captioning
Peiran Yan
Sheng-Wei Li
21
0
0
12 Aug 2022
Automated Audio Captioning with Epochal Difficult Captions for
  Curriculum Learning
Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning
Andrew Koh
Soham Dinesh Tiwari
Chng Eng Siong
17
1
0
04 Jun 2022
Automated Audio Captioning: An Overview of Recent Progress and New
  Challenges
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
27
37
0
12 May 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges
  in Audio Captioning
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning
Xuenan Xu
Zeyu Xie
Mengyue Wu
K. Yu
29
13
0
11 May 2022
Automated Audio Captioning using Audio Event Clues
Automated Audio Captioning using Audio Event Clues
Aycsegul Ozkaya Eren
M. Sert
21
0
0
18 Apr 2022
Caption Feature Space Regularization for Audio Captioning
Caption Feature Space Regularization for Audio Captioning
Yiming Zhang
Hong Yu
Ruoyi Du
Zhanyu Ma
Yuan Dong
16
1
0
18 Apr 2022
A Temporal-oriented Broadcast ResNet for COVID-19 Detection
A Temporal-oriented Broadcast ResNet for COVID-19 Detection
Xin Jing
Shuo Liu
Emilia Parada-Cabaleiro
Andreas Triantafyllopoulos
Meishu Song
Zijiang Yang
Björn W. Schuller
35
2
0
31 Mar 2022
Interactive Audio-text Representation for Automated Audio Captioning
  with Contrastive Learning
Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning
Chen Chen
Nana Hou
Yuchen Hu
Heqing Zou
Xiaofeng Qi
Chng Eng Siong
VLM
26
21
0
29 Mar 2022
Leveraging Pre-trained BERT for Audio Captioning
Leveraging Pre-trained BERT for Audio Captioning
Xubo Liu
Xinhao Mei
Qiushi Huang
Jianyuan Sun
Jinzheng Zhao
Haohe Liu
Mark D. Plumbley
Volkan Kilicc
Wenwu Wang
25
29
0
06 Mar 2022
Automatic Audio Captioning using Attention weighted Event based
  Embeddings
Automatic Audio Captioning using Attention weighted Event based Embeddings
Swapnil Bhosale
Rupayan Chakraborty
Sunil Kumar Kopparapu
26
0
0
28 Jan 2022
Audio Retrieval with Natural Language Queries: A Benchmark Study
Audio Retrieval with Natural Language Queries: A Benchmark Study
A. Sophia Koepke
Andreea-Maria Oncescu
João F. Henriques
Zeynep Akata
Samuel Albanie
22
100
0
17 Dec 2021
Evaluating Off-the-Shelf Machine Listening and Natural Language Models
  for Automated Audio Captioning
Evaluating Off-the-Shelf Machine Listening and Natural Language Models for Automated Audio Captioning
Benno Weck
Xavier Favory
K. Drossos
Xavier Serra
16
8
0
14 Oct 2021
Diverse Audio Captioning via Adversarial Training
Diverse Audio Captioning via Adversarial Training
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
GAN
43
28
0
13 Oct 2021
Improving the Performance of Automated Audio Captioning via Integrating
  the Acoustic and Semantic Information
Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information
Zhongjie Ye
Helin Wang
Dongchao Yang
Yuexian Zou
40
27
0
12 Oct 2021
Can Audio Captions Be Evaluated with Image Caption Metrics?
Can Audio Captions Be Evaluated with Image Caption Metrics?
Zelin Zhou
Zhiling Zhang
Xuenan Xu
Zeyu Xie
Mengyue Wu
Kenny Q. Zhu
30
41
0
10 Oct 2021
Automated Audio Captioning using Transfer Learning and Reconstruction
  Latent Space Similarity Regularization
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization
Andrew Koh
Fuzhao Xue
Chng Eng Siong
14
20
0
10 Aug 2021
An Encoder-Decoder Based Audio Captioning System With Transfer and
  Reinforcement Learning
An Encoder-Decoder Based Audio Captioning System With Transfer and Reinforcement Learning
Xinhao Mei
Qiushi Huang
Xubo Liu
Gengyun Chen
Jingqian Wu
...
Tom Ko
H. Tang
Xingkun Shao
Mark D. Plumbley
Wenwu Wang
20
53
0
05 Aug 2021
Audio Captioning Transformer
Audio Captioning Transformer
Xinhao Mei
Xubo Liu
Qiushi Huang
Mark D. Plumbley
Wenwu Wang
ViT
31
77
0
21 Jul 2021
Audio Retrieval with Natural Language Queries
Audio Retrieval with Natural Language Queries
Andreea-Maria Oncescu
A. Sophia Koepke
João F. Henriques
Zeynep Akata
Samuel Albanie
19
77
0
05 May 2021
Effective Approaches to Attention-based Neural Machine Translation
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
218
7,923
0
17 Aug 2015
1