Improved Image Captioning via Policy Gradient optimization of SPIDEr
1 December 2016
Siqi Liu
Zhenhai Zhu
Ning Ye
S. Guadarrama
Kevin Patrick Murphy
arXiv:1612.00370

Papers citing "Improved Image Captioning via Policy Gradient optimization of SPIDEr"

50 / 232 papers shown
Enhanced Modality Transition for Image Captioning
Ziwei Wang
Yadan Luo
Zi Huang
72
0
0
23 Feb 2021
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2021
Jun Chen
Han Guo
Kai Yi
Boyang Albert Li
Mohamed Elhoseiny
VLM
430
274
0
20 Feb 2021
Image Captioning using Multiple Transformers for Self-Attention Mechanism
Farrukh Olimov
Shikha Dubey
Labina Shrestha
Tran Trung Tin
M. Jeon
ViT
97
3
0
14 Feb 2021
The Role of Syntactic Planning in Compositional Image Captioning
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021
Emanuele Bugliarello
Desmond Elliott
CoGe
102
15
0
28 Jan 2021
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Wei-Ning Hsu
David Harwath
Christopher Song
James R. Glass
CLIP
175
74
0
31 Dec 2020
WEmbSim: A Simple yet Effective Metric for Image Captioning
International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2020
Naeha Sharif
Lyndon White
Bennamoun
Wei Liu
Syed Afaq Ali Shah
104
2
0
24 Dec 2020
LCEval: Learned Composite Metric for Caption Evaluation
International Journal of Computer Vision (IJCV), 2019
Naeha Sharif
Lyndon White
Bennamoun
Wei Liu
Syed Afaq Ali Shah
120
8
0
24 Dec 2020
Image Captioning with Context-Aware Auxiliary Guidance
AAAI Conference on Artificial Intelligence (AAAI), 2020
Zeliang Song
Xiaofei Zhou
Zhendong Mao
Jianlong Tan
205
34
0
10 Dec 2020
Understanding Guided Image Captioning Performance across Domains
Conference on Computational Natural Language Learning (CoNLL), 2020
Edwin G. Ng
Bo Pang
P. Sharma
Radu Soricut
369
28
0
04 Dec 2020
DORB: Dynamically Optimizing Multiple Rewards with Bandits
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020
Ramakanth Pasunuru
Han Guo
Joey Tianyi Zhou
OffRL
189
8
0
15 Nov 2020
Dual Attention on Pyramid Feature Maps for Image Captioning
IEEE Transactions on Multimedia (TMM), 2020
Litao Yu
Jian Zhang
Qiang Wu
335
58
0
02 Nov 2020
Boost Image Captioning with Knowledge Reasoning
Machine-mediated learning (ML), 2020
Feicheng Huang
Zhiwen Wang
Haiyang Wei
Canlong Zhang
Huifang Ma
115
27
0
02 Nov 2020
DeepOpht: Medical Report Generation for Retinal Images via Deep Models and Visual Explanation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
Jia-Hong Huang
Chao-Han Huck Yang
Fangyu Liu
Meng Tian
Yi-Chieh Liu
...
Kang Wang
Hiromasa Morikawa
Hernghua Chang
Jesper N. Tegnér
M. Worring
MedIm
206
65
0
01 Nov 2020
WaveTransformer: A Novel Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information
An Tran
Konstantinos Drossos
Maria Sandsten
197
19
0
21 Oct 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Neurocomputing (Neurocomputing), 2020
Wei Chen
Weiping Wang
Tianpeng Liu
M. Lew
VLM
329
36
0
16 Oct 2020
Adversarial Grammatical Error Correction
Findings (Findings), 2020
Vipul Raheja
Dimitrios Alikaniotis
148
11
0
06 Oct 2020
Teacher-Critical Training Strategies for Image Captioning
Yiqing Huang
Jiansheng Chen
VLM
147
9
0
30 Sep 2020
Where is the Model Looking At?--Concentrate and Explain the Network Attention
Wenjia Xu
Jiuniu Wang
Yang Wang
Guangluan Xu
Wei Dai
Yirong Wu
XAI
123
17
0
29 Sep 2020
Effects of Word-frequency based Pre- and Post- Processings for Audio Captioning
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020
Daiki Takeuchi
Yuma Koizumi
Yasunori Ohishi
Noboru Harada
K. Kashino
171
28
0
24 Sep 2020
Towards Unique and Informative Captioning of Images
European Conference on Computer Vision (ECCV), 2020
Zeyu Wang
Berthy Feng
Karthik Narasimhan
Olga Russakovsky
168
38
0
08 Sep 2020
A Survey of Evaluation Metrics Used for NLG Systems
ACM Computing Surveys (ACM CSUR), 2020
Ananya B. Sai
Akash Kumar Mohankumar
Mitesh M. Khapra
ELM
446
288
0
27 Aug 2020
Learn to Talk via Proactive Knowledge Transfer
Qing Sun
James Cross
98
0
0
23 Aug 2020
Assisting Scene Graph Generation with Self-Supervision
Sandeep Inuganti
V. Balasubramanian
SSL
177
7
0
08 Aug 2020
Neural Language Generation: Formulation, Methods, and Evaluation
Cristina Garbacea
Qiaozhu Mei
353
29
0
31 Jul 2020
A Unified Framework of Surrogate Loss by Refactoring and Interpolation
European Conference on Computer Vision (ECCV), 2020
Lanlan Liu
Mingzhe Wang
Gaowen Liu
154
9
0
27 Jul 2020
Multi-task Regularization Based on Infrequent Classes for Audio Captioning
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020
Emre Çakir
Konstantinos Drossos
Maria Sandsten
108
17
0
09 Jul 2020
Temporal Sub-sampling of Audio Feature Sequences for Automated Audio Captioning
K. Nguyen
Konstantinos Drossos
Maria Sandsten
106
12
0
06 Jul 2020
Listen carefully and tell: an audio captioning system based on residual learning and gammatone audio representation
Sergi Perez-Castanos
Javier Naranjo-Alcazar
P. Zuccarello
M. Cobos
157
12
0
27 Jun 2020
Evaluation of Text Generation: A Survey
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELM
LM&MA
327
420
0
26 Jun 2020
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation
Mingjie Li
Fuyu Wang
Xiaojun Chang
Xiaodan Liang
MedIm
201
132
0
06 Jun 2020
MLE-guided parameter search for task loss minimization in neural sequence modeling
Sean Welleck
Dong Wang
193
9
0
04 Jun 2020
Chat as Expected: Learning to Manipulate Black-box Neural Dialogue Models
Haochen Liu
Zhiwei Wang
Hanyu Wang
Shucheng Zhou
AAML
169
15
0
27 May 2020
ALBA : Reinforcement Learning for Video Object Segmentation
British Machine Vision Conference (BMVC), 2020
Shreyank N. Gowda
Panagiotis Eustratiadis
Timothy M. Hospedales
Laura Sevilla-Lara
VOS
173
11
0
26 May 2020
Learning a Reinforced Agent for Flexible Exposure Bracketing Selection
Computer Vision and Pattern Recognition (CVPR), 2020
Zhouxia Wang
Jiawei Zhang
Mude Lin
Zhenghao Hu
Ping Luo
Jimmy S. J. Ren
147
19
0
26 May 2020
Self-Supervised and Controlled Multi-Document Opinion Summarization
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2020
Hady ElSahar
Maximin Coavoux
Matthias Gallé
Jos Rozen
126
51
0
30 Apr 2020
Show, Describe and Conclude: On Exploiting the Structure Information of Chest X-Ray Reports
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Baoyu Jing
Zeya Wang
Eric Xing
268
167
0
26 Apr 2020
Context-Aware Group Captioning via Self-Attention and Contrastive Features
Computer Vision and Pattern Recognition (CVPR), 2020
Zhuowan Li
Quan Hung Tran
Long Mai
Zhe Lin
Alan Yuille
VLM
168
50
0
07 Apr 2020
Learning Compact Reward for Image Captioning
Nannan Li
Zhenzhong Chen
149
3
0
24 Mar 2020
Visual Question Answering for Cultural Heritage
P. Bongini
Federico Becattini
Andrew D. Bagdanov
Marco Bertini
841
29
0
22 Mar 2020
Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs
Computer Vision and Pattern Recognition (CVPR), 2020
Shizhe Chen
Qin Jin
Peng Wang
Qi Wu
DiffM
327
238
0
01 Mar 2020
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework
C. Sur
226
7
0
16 Feb 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
Multimedia Tools and Applications (MTA), 2020
C. Sur
159
17
0
15 Feb 2020
aiTPR: Attribute Interaction-Tensor Product Representation for Image Caption
Neural Processing Letters (NPL), 2020
C. Sur
119
10
0
27 Jan 2020
Nested-Wasserstein Self-Imitation Learning for Sequence Generation
International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
Ruiyi Zhang
Changyou Chen
Zhe Gan
Zheng Wen
Wenlin Wang
Lawrence Carin
204
7
0
20 Jan 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020
A. Cherian
Jue Wang
Chiori Hori
Tim K. Marks
AI4TS
117
22
0
17 Jan 2020
Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation
International Conference on Learning Representations (ICLR), 2019
Xinjie Fan
Yizhe Zhang
Zhendong Wang
Mingyuan Zhou
BDL
139
4
0
31 Dec 2019
Vision and Language: from Visual Perception to Content Creation
APSIPA Transactions on Signal and Information Processing (APSIPA TSIP), 2019
Tao Mei
Wei Zhang
Ting Yao
VLM
182
8
0
26 Dec 2019
Meshed-Memory Transformer for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2019
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
262
1,025
0
17 Dec 2019
Learning to Relate from Captions and Bounding Boxes
Annual Meeting of the Association for Computational Linguistics (ACL), 2019
Sarthak Garg
Joel Ruben Antony Moniz
Anshu Aviral
Priyatham Bollimpalli
161
4
0
01 Dec 2019
CRUR: Coupled-Recurrent Unit for Unification, Conceptualization and Context Capture for Language Representation -- A Generalization of Bi Directional LSTM
Multimedia Tools and Applications (MTA), 2019
C. Sur
BDL
188
6
0
22 Nov 2019