1612.00370
Improved Image Captioning via Policy Gradient optimization of SPIDEr
1 December 2016
Siqi Liu
Zhenhai Zhu
Ning Ye
S. Guadarrama
Kevin Patrick Murphy
Papers citing
"Improved Image Captioning via Policy Gradient optimization of SPIDEr"
50 / 232 papers shown
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
International Conference on Image Analysis and Processing (ICIAP), 2023
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
257
7
0
20 May 2023
DiffCap: Exploring Continuous Diffusion on Image Captioning
Yufeng He
Zefan Cai
Xu Gan
Baobao Chang
DiffM
205
11
0
20 May 2023
BOLT: Fast Energy-based Controlled Text Generation with Tunable Biases
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xin Liu
Muhammad Khalifa
Lu Wang
328
23
0
19 May 2023
Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer
European Signal Processing Conference (EUSIPCO), 2023
Etienne Labbé
J. Pinquier
Thomas Pellegrini
205
5
0
02 May 2023
Towards Explainable and Safe Conversational Agents for Mental Health: A Survey
Surjodeep Sarkar
Manas Gaur
L. Chen
Muskan Garg
Biplav Srivastava
B. Dongaonkar
AI4MH
158
4
0
25 Apr 2023
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Mizhaan Prajit Maniyar
Akash Mondal
Prashanth L.A.
S. Bhatnagar
183
4
0
21 Apr 2023
Graph Attention for Automated Audio Captioning
IEEE Signal Processing Letters (IEEE SPL), 2023
Feiyang Xiao
Jian Guan
Qiaoxi Zhu
Wenwu Wang
197
11
0
07 Apr 2023
Prefix tuning for automated audio captioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
353
53
0
30 Mar 2023
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Xinhao Mei
Chutong Meng
Haohe Liu
Qiuqiang Kong
Tom Ko
Chengqi Zhao
Mark D. Plumbley
Yuexian Zou
Wenwu Wang
337
306
0
30 Mar 2023
ImageAssist: Tools for Enhancing Touchscreen-Based Image Exploration Systems for Blind and Low Vision Users
International Conference on Human Factors in Computing Systems (CHI), 2023
Vishnu Nair
Han Zhu
Brian A. Smith
158
27
0
17 Feb 2023
Semantics-Empowered Communication: A Tutorial-cum-Survey
Zhilin Lu
Rongpeng Li
Kun Lu
Xianfu Chen
Ekram Hossain
Zhifeng Zhao
Honggang Zhang
528
23
0
16 Dec 2022
Impact of visual assistance for automated audio captioning
Wim Boes
Hugo Van hamme
192
1
0
18 Nov 2022
Is my automatic audio captioning system so bad? SPIDEr-max: a metric to consider several caption candidates
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2022
Etienne Labbé
Thomas Pellegrini
J. Pinquier
106
5
0
14 Nov 2022
Exploring Train and Test-Time Augmentations for Audio-Language Learning
Eungbeom Kim
Jinhee Kim
Yoori Oh
Kyungsu Kim
Minju Park
Jaeheon Sim
J. Lee
Kyogu Lee
167
16
0
31 Oct 2022
Hybrid Reinforced Medical Report Generation with M-Linear Attention and Repetition Penalty
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Wenting Xu
Zhenghua Xu
Junyang Chen
Chang Qi
Thomas Lukasiewicz
MedIm
174
15
0
14 Oct 2022
Automated Audio Captioning via Fusion of Low- and High- Dimensional Features
Jianyuan Sun
Xubo Liu
Xinhao Mei
Mark D. Plumbley
V. Kılıç
Wenwu Wang
176
3
0
10 Oct 2022
Vision+X: A Survey on Multimodal Learning in the Light of Data
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Ye Zhu
Yuehua Wu
Andrii Zadaianchuk
Yan Yan
354
38
0
05 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
565
279
0
03 Oct 2022
Paraphrasing Is All You Need for Novel Object Captioning
Neural Information Processing Systems (NeurIPS), 2022
Cheng Yang
Yifan Hao
Wanshu Fan
Ruslan Salakhutdinov
Louis-Philippe Morency
Yu-Chiang Frank Wang
184
6
0
25 Sep 2022
Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia
AAAI Conference on Artificial Intelligence (AAAI), 2022
K. Nguyen
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
189
15
0
21 Sep 2022
An investigation on selecting audio pre-trained models for audio captioning
Peiran Yan
Sheng-Wei Li
126
0
0
12 Aug 2022
Is GPT-3 all you need for Visual Question Answering in Cultural Heritage?
P. Bongini
Federico Becattini
Marco Bertini
206
19
0
25 Jul 2022
Rethinking the Reference-based Distinctive Image Captioning
ACM Multimedia (ACM MM), 2022
Yangjun Mao
Long Chen
Zhihong Jiang
Dong Zhang
Zhimeng Zhang
Jian Shao
Jun Xiao
DiffM
225
23
0
22 Jul 2022
Efficient Modeling of Future Context for Image Captioning
ACM Multimedia (ACM MM), 2022
Zhengcong Fei
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
208
16
0
22 Jul 2022
Automated Audio Captioning and Language-Based Audio Retrieval
Clive Gomes
Hyejin Park
Patrick Kollman
Yi-Zhe Song
Iffanice Houndayi
Ankit Parag Shah
297
1
0
08 Jul 2022
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
EURASIP Journal on Audio, Speech, and Music Processing (EURASIP J. Audio Speech Music Process.), 2022
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
290
54
0
12 May 2022
Caption Feature Space Regularization for Audio Captioning
Yiming Zhang
Hong Yu
Ruoyi Du
Zhanyu Ma
Yuan Dong
202
3
0
18 Apr 2022
Towards Lightweight Transformer via Group-wise Transformation for Vision-and-Language Tasks
IEEE Transactions on Image Processing (IEEE TIP), 2022
Gen Luo
Weihao Ye
Xiaoshuai Sun
Yan Wang
Liujuan Cao
Yongjian Wu
Feiyue Huang
Rongrong Ji
ViT
153
57
0
16 Apr 2022
Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning
Chen Chen
Nana Hou
Yuchen Hu
Heqing Zou
Xiaofeng Qi
Chng Eng Siong
VLM
188
24
0
29 Mar 2022
Leveraging Pre-trained BERT for Audio Captioning
European Signal Processing Conference (EUSIPCO), 2022
Xubo Liu
Xinhao Mei
Qiushi Huang
Jianyuan Sun
Jinzheng Zhao
Haohe Liu
Mark D. Plumbley
Volkan Kılıç
Wenwu Wang
267
32
0
06 Mar 2022
CaMEL: Mean Teacher Learning for Image Captioning
International Conference on Pattern Recognition (ICPR), 2022
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
194
37
0
21 Feb 2022
Joint Speech Recognition and Audio Captioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022
Chaitanya Narisetty
E. Tsunoo
Xuankai Chang
Yosuke Kashiwagi
Michael Hentschel
Shinji Watanabe
130
10
0
03 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
ACM Computing Surveys (ACM CSUR), 2022
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
480
150
0
31 Jan 2022
Local Information Assisted Attention-free Decoder for Audio Captioning
IEEE Signal Processing Letters (SPL), 2022
Feiyang Xiao
Jian Guan
Haiyan Lan
Qiaoxi Zhu
Wenwu Wang
270
13
0
10 Jan 2022
A Survey of Natural Language Generation
ACM Computing Surveys (CSUR), 2021
Chenhe Dong
Hai-Tao Zheng
Haifan Gong
Mengzhao Chen
Junxin Li
Ying Shen
Min Yang
3DV
336
63
0
22 Dec 2021
Evaluating Off-the-Shelf Machine Listening and Natural Language Models for Automated Audio Captioning
Benno Weck
Xavier Favory
Konstantinos Drossos
Xavier Serra
140
9
0
14 Oct 2021
Audio Captioning Using Sound Event Detection
Ayşegül Özkaya Eren
M. Sert
168
8
0
04 Oct 2021
CIDEr-R: Robust Consensus-based Image Description Evaluation
G. O. D. Santos
Esther Luna Colombini
Sandra Avila
151
40
0
28 Sep 2021
Reinforcement Learning-powered Semantic Communication via Semantic Similarity
Kun Lu
Rongpeng Li
Xianfu Chen
Zhifeng Zhao
Honggang Zhang
157
57
0
27 Aug 2021
Medical-VLBERT: Medical Visual Language BERT for COVID-19 CT Report Generation With Alternate Learning
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2021
Guangyi Liu
Yinghong Liao
Fuyu Wang
Bin Zhang
Lu Zhang
...
Xiang Wan
Shaolin Li
Zhen Li
Shuixing Zhang
Shuguang Cui
274
73
0
11 Aug 2021
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021
Andrew Koh
Fuzhao Xue
Chng Eng Siong
129
22
0
10 Aug 2021
An Encoder-Decoder Based Audio Captioning System With Transfer and Reinforcement Learning
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2021
Xinhao Mei
Qiushi Huang
Xubo Liu
Gengyun Chen
Jingqian Wu
...
Tom Ko
H. Tang
Xingkun Shao
Mark D. Plumbley
Wenwu Wang
182
60
0
05 Aug 2021
Continual Learning for Automated Audio Captioning Using The Learning Without Forgetting Approach
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2021
Jan van den Berg
Konstantinos Drossos
CLL
140
12
0
16 Jul 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
435
344
0
14 Jul 2021
Don't Take It Literally: An Edit-Invariant Sequence Loss for Text Generation
North American Chapter of the Association for Computational Linguistics (NAACL), 2021
Guangyi Liu
Zichao Yang
Tianhua Tao
Xiaodan Liang
Junwei Bao
Zhen Li
Bowen Zhou
Shuguang Cui
Zhiting Hu
389
23
0
29 Jun 2021
SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis
Annual Meeting of the Association for Computational Linguistics (ACL), 2021
Joshua Forster Feinglass
Yezhou Yang
81
24
0
02 Jun 2021
Longer Version for "Deep Context-Encoding Network for Retinal Image Captioning"
Jia-Hong Huang
Ting-Wei Wu
Chao-Han Huck Yang
Marcel Worring
MedIm
160
33
0
30 May 2021
Contextualized Keyword Representations for Multi-modal Retinal Image Captioning
International Conference on Multimedia Retrieval (ICMR), 2021
Jia-Hong Huang
Ting-Wei Wu
Marcel Worring
MedIm
243
31
0
26 Apr 2021
MusCaps: Generating Captions for Music Audio
IEEE International Joint Conference on Neural Network (IJCNN), 2021
Ilaria Manco
Emmanouil Benetos
Elio Quinton
Gyorgy Fazekas
281
43
0
24 Apr 2021
Towards Accurate Text-based Image Captioning with Content Diversity Exploration
Computer Vision and Pattern Recognition (CVPR), 2021
Guanghui Xu
Shuaicheng Niu
Zhuliang Yu
Yucheng Luo
Qing Du
Qi Wu
DiffM
233
67
0
23 Apr 2021