Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2012.07331
Cited By
Audio Captioning using Pre-Trained Large-Scale Language Model Guided by Audio-based Similar Caption Retrieval
14 December 2020
Yuma Koizumi
Yasunori Ohishi
Daisuke Niizumi
Daiki Takeuchi
Masahiro Yasuda
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Audio Captioning using Pre-Trained Large-Scale Language Model Guided by Audio-based Similar Caption Retrieval"
36 / 36 papers shown
Title
VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents
Ryota Tanaka
Taichi Iki
Taku Hasegawa
Kyosuke Nishida
Kuniko Saito
Jun Suzuki
VLM
47
0
0
14 Apr 2025
RANGE: Retrieval Augmented Neural Fields for Multi-Resolution Geo-Embeddings
A. Dhakal
S. Sastry
Subash Khanal
Adeel Ahmad
Eric Xing
Nathan Jacobs
48
0
0
27 Feb 2025
Audio-Language Datasets of Scenes and Events: A Survey
Gijs Wijngaard
Elia Formisano
Michele Esposito
M. Dumontier
79
2
0
10 Jan 2025
R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation
Fuda Ye
Shuangyin Li
Yongqi Zhang
L. Chen
30
0
0
19 Jun 2024
Retrieval-Augmented Generation for AI-Generated Content: A Survey
Penghao Zhao
Hailin Zhang
Qinhan Yu
Zhengren Wang
Yunteng Geng
Fangcheng Fu
Ling Yang
Wentao Zhang
Jie Jiang
Bin Cui
3DV
110
220
0
29 Feb 2024
Intelligent Director: An Automatic Framework for Dynamic Visual Composition using ChatGPT
Sixiao Zheng
Jingyang Huo
Yu Wang
Yanwei Fu
VGen
DiffM
29
1
0
24 Feb 2024
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
Jaeyeon Kim
Jaeyoon Jung
Jinjoo Lee
Sang Hoon Woo
CLIP
VLM
16
21
0
31 Jan 2024
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
Jiaxi Lv
Yi Huang
Mingfu Yan
Jiancheng Huang
Jianzhuang Liu
Yifan Liu
Yafei Wen
Xiaoxin Chen
Shifeng Chen
VGen
DiffM
28
23
0
21 Nov 2023
Zero-shot audio captioning with audio-language model guidance and audio context keywords
Leonard Salewski
Stefan Fauth
A. Sophia Koepke
Zeynep Akata
16
10
0
14 Nov 2023
RECAP: Retrieval-Augmented Audio Captioning
Sreyan Ghosh
Sonal Kumar
Chandra Kiran Reddy Evuru
R. Duraiswami
Dinesh Manocha
VLM
62
17
0
18 Sep 2023
Synth-AC: Enhancing Audio Captioning with Synthetic Supervision
Feiyang Xiao
Qiaoxi Zhu
Jian Guan
Xubo Liu
Haohe Liu
Kejia Zhang
Wenwu Wang
11
2
0
18 Sep 2023
Training Audio Captioning Models without Audio
Soham Deshmukh
Benjamin Elizalde
Dimitra Emmanouilidou
Bhiksha Raj
Rita Singh
Huaming Wang
19
18
0
14 Sep 2023
Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement
Daiki Takeuchi
Yasunori Ohishi
Daisuke Niizumi
Noboru Harada
K. Kashino
11
6
0
23 Aug 2023
Improving Audio Caption Fluency with Automatic Error Correction
Hanxue Zhang
Zeyu Xie
Xuenan Xu
Mengyue Wu
K. Yu
16
0
0
16 Jun 2023
Enhance Temporal Relations in Audio Captioning with Sound Event Detection
Zeyu Xie
Xuenan Xu
Mengyue Wu
K. Yu
11
10
0
02 Jun 2023
DirecT2V: Large Language Models are Frame-Level Directors for Zero-Shot Text-to-Video Generation
Susung Hong
Junyoung Seo
Heeseong Shin
Sung‐Jin Hong
Seung Wook Kim
DiffM
VGen
18
34
0
23 May 2023
Listen, Think, and Understand
Yuan Gong
Hongyin Luo
Alexander H. Liu
Leonid Karlinsky
James R. Glass
ELM
MLLM
LRM
29
135
0
18 May 2023
Efficient Audio Captioning Transformer with Patchout and Text Guidance
Thodoris Kouzelis
Grigoris Bastas
Athanasios Katsamanis
Alexandros Potamianos
ViT
10
6
0
06 Apr 2023
Prefix tuning for automated audio captioning
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
8
42
0
30 Mar 2023
eP-ALM: Efficient Perceptual Augmentation of Language Models
Mustafa Shukor
Corentin Dancette
Matthieu Cord
MLLM
VLM
24
29
0
20 Mar 2023
Retrieving Multimodal Information for Augmented Generation: A Survey
Ruochen Zhao
Hailin Chen
Weishi Wang
Fangkai Jiao
Do Xuan Long
...
Bosheng Ding
Xiaobao Guo
Minzhi Li
Xingxuan Li
Shafiq R. Joty
13
80
0
20 Mar 2023
BLAT: Bootstrapping Language-Audio Pre-training based on AudioSet Tag-guided Synthetic Data
Xuenan Xu
Zhiling Zhang
Zelin Zhou
Pingyue Zhang
Zeyu Xie
Mengyue Wu
Ke Zhu
CLIP
58
14
0
14 Mar 2023
Automated Audio Captioning with Epochal Difficult Captions for Curriculum Learning
Andrew Koh
Soham Dinesh Tiwari
Chng Eng Siong
17
1
0
04 Jun 2022
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
17
37
0
12 May 2022
Beyond the Status Quo: A Contemporary Survey of Advances and Challenges in Audio Captioning
Xuenan Xu
Zeyu Xie
Mengyue Wu
K. Yu
21
13
0
11 May 2022
Leveraging Pre-trained BERT for Audio Captioning
Xubo Liu
Xinhao Mei
Qiushi Huang
Jianyuan Sun
Jinzheng Zhao
Haohe Liu
Mark D. Plumbley
Volkan Kilicc
Wenwu Wang
17
29
0
06 Mar 2022
Automatic Audio Captioning using Attention weighted Event based Embeddings
Swapnil Bhosale
Rupayan Chakraborty
Sunil Kumar Kopparapu
26
0
0
28 Jan 2022
Audio Retrieval with Natural Language Queries: A Benchmark Study
A. Sophia Koepke
Andreea-Maria Oncescu
João F. Henriques
Zeynep Akata
Samuel Albanie
20
100
0
17 Dec 2021
Evaluating Off-the-Shelf Machine Listening and Natural Language Models for Automated Audio Captioning
Benno Weck
Xavier Favory
K. Drossos
Xavier Serra
14
8
0
14 Oct 2021
Diverse Audio Captioning via Adversarial Training
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
GAN
43
28
0
13 Oct 2021
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization
Andrew Koh
Fuzhao Xue
Chng Eng Siong
8
20
0
10 Aug 2021
Audio Captioning Transformer
Xinhao Mei
Xubo Liu
Qiushi Huang
Mark D. Plumbley
Wenwu Wang
ViT
18
77
0
21 Jul 2021
Audio Retrieval with Natural Language Queries
Andreea-Maria Oncescu
A. Sophia Koepke
João F. Henriques
Zeynep Akata
Samuel Albanie
17
77
0
05 May 2021
MusCaps: Generating Captions for Music Audio
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
30
36
0
24 Apr 2021
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
214
7,687
0
17 Aug 2015
Acoustic Scene Classification
D. Barchiesi
D. Giannoulis
D. Stowell
Mark D. Plumbley
98
403
0
13 Nov 2014
1