ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1612.00370
  4. Cited By
Improved Image Captioning via Policy Gradient optimization of SPIDEr

Improved Image Captioning via Policy Gradient optimization of SPIDEr

1 December 2016
Siqi Liu
Zhenhai Zhu
Ning Ye
S. Guadarrama
Kevin Patrick Murphy
ArXivPDFHTML

Papers citing "Improved Image Captioning via Policy Gradient optimization of SPIDEr"

50 / 67 papers shown
Title
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Ask in Any Modality: A Comprehensive Survey on Multimodal Retrieval-Augmented Generation
Mohammad Mahdi Abootorabi
Amirhosein Zobeiri
Mahdi Dehghani
Mohammadali Mohammadkhani
Bardia Mohammadi
Omid Ghahroodi
M. Baghshah
Ehsaneddin Asgari
RALM
100
4
0
12 Feb 2025
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent
  Approach
Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach
Rory Young
Nicolas Pugeault
AAML
57
0
0
14 Oct 2024
An Eye for an Ear: Zero-shot Audio Description Leveraging an Image
  Captioner using Audiovisual Distribution Alignment
An Eye for an Ear: Zero-shot Audio Description Leveraging an Image Captioner using Audiovisual Distribution Alignment
Hugo Malard
Michel Olvera
Stéphane Lathuilière
S. Essid
VLM
34
0
0
08 Oct 2024
Fine-grained Audio-Visual Joint Representations for Multimodal Large
  Language Models
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models
Guangzhi Sun
Wenyi Yu
Changli Tang
Xianzhao Chen
Tian Tan
Wei Li
Lu Lu
Zejun Ma
Chao Zhang
28
12
0
09 Oct 2023
Audio Difference Captioning Utilizing Similarity-Discrepancy
  Disentanglement
Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement
Daiki Takeuchi
Yasunori Ohishi
Daisuke Niizumi
Noboru Harada
K. Kashino
17
6
0
23 Aug 2023
A request for clarity over the End of Sequence token in the
  Self-Critical Sequence Training
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
26
6
0
20 May 2023
DiffCap: Exploring Continuous Diffusion on Image Captioning
DiffCap: Exploring Continuous Diffusion on Image Captioning
Yufeng He
Zefan Cai
Xu Gan
Baobao Chang
DiffM
21
5
0
20 May 2023
Semantics-Empowered Communication: A Tutorial-cum-Survey
Semantics-Empowered Communication: A Tutorial-cum-Survey
Zhilin Lu
Rongpeng Li
Kun Lu
Xianfu Chen
E. Hossain
Zhifeng Zhao
Honggang Zhang
31
19
0
16 Dec 2022
Automated Audio Captioning via Fusion of Low- and High- Dimensional
  Features
Automated Audio Captioning via Fusion of Low- and High- Dimensional Features
Jianyuan Sun
Xubo Liu
Xinhao Mei
Mark D. Plumbley
V. Kılıç
Wenwu Wang
25
3
0
10 Oct 2022
Is Reinforcement Learning (Not) for Natural Language Processing:
  Benchmarks, Baselines, and Building Blocks for Natural Language Policy
  Optimization
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Rajkumar Ramamurthy
Prithviraj Ammanabrolu
Kianté Brantley
Jack Hessel
R. Sifa
Christian Bauckhage
Hannaneh Hajishirzi
Yejin Choi
OffRL
31
239
0
03 Oct 2022
Paraphrasing Is All You Need for Novel Object Captioning
Paraphrasing Is All You Need for Novel Object Captioning
Cheng Yang
Yao-Hung Hubert Tsai
Wanshu Fan
Ruslan Salakhutdinov
Louis-Philippe Morency
Yu-Chiang Frank Wang
36
4
0
25 Sep 2022
Show, Interpret and Tell: Entity-aware Contextualised Image Captioning
  in Wikipedia
Show, Interpret and Tell: Entity-aware Contextualised Image Captioning in Wikipedia
K. Nguyen
Ali Furkan Biten
Andrés Mafla
Lluís Gómez
Dimosthenis Karatzas
28
10
0
21 Sep 2022
An investigation on selecting audio pre-trained models for audio
  captioning
An investigation on selecting audio pre-trained models for audio captioning
Peiran Yan
Sheng-Wei Li
21
0
0
12 Aug 2022
Automated Audio Captioning: An Overview of Recent Progress and New
  Challenges
Automated Audio Captioning: An Overview of Recent Progress and New Challenges
Xinhao Mei
Xubo Liu
Mark D. Plumbley
Wenwu Wang
27
37
0
12 May 2022
Interactive Audio-text Representation for Automated Audio Captioning
  with Contrastive Learning
Interactive Audio-text Representation for Automated Audio Captioning with Contrastive Learning
Chen Chen
Nana Hou
Yuchen Hu
Heqing Zou
Xiaofeng Qi
Chng Eng Siong
VLM
26
21
0
29 Mar 2022
Leveraging Pre-trained BERT for Audio Captioning
Leveraging Pre-trained BERT for Audio Captioning
Xubo Liu
Xinhao Mei
Qiushi Huang
Jianyuan Sun
Jinzheng Zhao
Haohe Liu
Mark D. Plumbley
Volkan Kilicc
Wenwu Wang
25
29
0
06 Mar 2022
CaMEL: Mean Teacher Learning for Image Captioning
CaMEL: Mean Teacher Learning for Image Captioning
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
25
27
0
21 Feb 2022
Deep Learning Approaches on Image Captioning: A Review
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi
H. Pourreza
H. Mahyar
VLM
10
89
0
31 Jan 2022
A Survey of Natural Language Generation
A Survey of Natural Language Generation
Chenhe Dong
Yinghui Li
Haifan Gong
M. Chen
Junxin Li
Ying Shen
Min Yang
3DV
24
43
0
22 Dec 2021
Audio Captioning Using Sound Event Detection
Audio Captioning Using Sound Event Detection
Aycsegul Ozkaya Eren
M. Sert
35
8
0
04 Oct 2021
Medical-VLBERT: Medical Visual Language BERT for COVID-19 CT Report
  Generation With Alternate Learning
Medical-VLBERT: Medical Visual Language BERT for COVID-19 CT Report Generation With Alternate Learning
Guangyi Liu
Yinghong Liao
Fuyu Wang
Bin Zhang
Lu Zhang
...
Xiang Wan
Shaolin Li
Zhen Li
Shuixing Zhang
Shuguang Cui
13
56
0
11 Aug 2021
Automated Audio Captioning using Transfer Learning and Reconstruction
  Latent Space Similarity Regularization
Automated Audio Captioning using Transfer Learning and Reconstruction Latent Space Similarity Regularization
Andrew Koh
Fuzhao Xue
Chng Eng Siong
14
20
0
10 Aug 2021
From Show to Tell: A Survey on Deep Learning-based Image Captioning
From Show to Tell: A Survey on Deep Learning-based Image Captioning
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
S. Cascianelli
G. Fiameni
Rita Cucchiara
3DV
VLM
MLLM
58
254
0
14 Jul 2021
Longer Version for "Deep Context-Encoding Network for Retinal Image
  Captioning"
Longer Version for "Deep Context-Encoding Network for Retinal Image Captioning"
Jia-Hong Huang
Ting-Wei Wu
Chao-Han Huck Yang
M. Worring
MedIm
17
28
0
30 May 2021
Image Captioning using Multiple Transformers for Self-Attention
  Mechanism
Image Captioning using Multiple Transformers for Self-Attention Mechanism
Farrukh Olimov
Shikha Dubey
Labina Shrestha
Tran Trung Tin
M. Jeon
ViT
26
2
0
14 Feb 2021
DORB: Dynamically Optimizing Multiple Rewards with Bandits
DORB: Dynamically Optimizing Multiple Rewards with Bandits
Ramakanth Pasunuru
Han Guo
Mohit Bansal
OffRL
30
6
0
15 Nov 2020
Dual Attention on Pyramid Feature Maps for Image Captioning
Dual Attention on Pyramid Feature Maps for Image Captioning
Litao Yu
Jian Andrew Zhang
Qiang Wu
16
47
0
02 Nov 2020
WaveTransformer: A Novel Architecture for Audio Captioning Based on
  Learning Temporal and Time-Frequency Information
WaveTransformer: A Novel Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information
An Tran
K. Drossos
Tuomas Virtanen
28
19
0
21 Oct 2020
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
New Ideas and Trends in Deep Multimodal Content Understanding: A Review
Wei-Neng Chen
Weiping Wang
Li Liu
M. Lew
VLM
112
31
0
16 Oct 2020
Teacher-Critical Training Strategies for Image Captioning
Teacher-Critical Training Strategies for Image Captioning
Yiqing Huang
Jiansheng Chen
VLM
16
8
0
30 Sep 2020
Towards Unique and Informative Captioning of Images
Towards Unique and Informative Captioning of Images
Zeyu Wang
Berthy T. Feng
Karthik Narasimhan
Olga Russakovsky
17
37
0
08 Sep 2020
A Survey of Evaluation Metrics Used for NLG Systems
A Survey of Evaluation Metrics Used for NLG Systems
Ananya B. Sai
Akash Kumar Mohankumar
Mitesh M. Khapra
ELM
25
228
0
27 Aug 2020
Assisting Scene Graph Generation with Self-Supervision
Assisting Scene Graph Generation with Self-Supervision
Sandeep Inuganti
V. Balasubramanian
SSL
11
7
0
08 Aug 2020
A Unified Framework of Surrogate Loss by Refactoring and Interpolation
A Unified Framework of Surrogate Loss by Refactoring and Interpolation
Lanlan Liu
Mingzhe Wang
Jia Deng
19
8
0
27 Jul 2020
Evaluation of Text Generation: A Survey
Evaluation of Text Generation: A Survey
Asli Celikyilmaz
Elizabeth Clark
Jianfeng Gao
ELM
LM&MA
19
376
0
26 Jun 2020
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report
  Generation
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation
Mingjie Li
Fuyu Wang
Xiaojun Chang
Xiaodan Liang
MedIm
21
101
0
06 Jun 2020
Chat as Expected: Learning to Manipulate Black-box Neural Dialogue
  Models
Chat as Expected: Learning to Manipulate Black-box Neural Dialogue Models
Haochen Liu
Zhiwei Wang
Tyler Derr
Jiliang Tang
AAML
14
15
0
27 May 2020
Visual Question Answering for Cultural Heritage
Visual Question Answering for Cultural Heritage
P. Bongini
Federico Becattini
Andrew D. Bagdanov
A. Bimbo
170
22
0
22 Mar 2020
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic
  Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO
  Framework
Gaussian Smoothen Semantic Features (GSSF) -- Exploring the Linguistic Aspects of Visual Captioning in Indian Languages (Bengali) Using MSCOCO Framework
C. Sur
13
7
0
16 Feb 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image
  Captioning With R-CNN Feature Distribution Composition (FDC)
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
C. Sur
25
16
0
15 Feb 2020
Spatio-Temporal Ranked-Attention Networks for Video Captioning
Spatio-Temporal Ranked-Attention Networks for Video Captioning
A. Cherian
Jue Wang
Chiori Hori
Tim K. Marks
AI4TS
20
19
0
17 Jan 2020
Meshed-Memory Transformer for Image Captioning
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
14
868
0
17 Dec 2019
Learning Visual Relation Priors for Image-Text Matching and Image
  Captioning with Neural Scene Graph Generators
Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators
Kuang-Huei Lee
Hamid Palangi
Xi Chen
Houdong Hu
Jianfeng Gao
VLM
19
37
0
22 Sep 2019
Neural Keyphrase Generation via Reinforcement Learning with Adaptive
  Rewards
Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards
Hou Pong Chan
Wang Chen
Lu Wang
Irwin King
11
81
0
10 Jun 2019
Interactive Image Generation Using Scene Graphs
Interactive Image Generation Using Scene Graphs
Gaurav Mittal
Shubham Agrawal
Anuva Agarwal
Sushant Mehta
Tanya Marwah
GNN
22
53
0
09 May 2019
Describing like humans: on diversity in image captioning
Describing like humans: on diversity in image captioning
Qingzhong Wang
Antoni B. Chan
19
97
0
28 Mar 2019
Boosted Attention: Leveraging Human Attention for Image Captioning
Boosted Attention: Leveraging Human Attention for Image Captioning
Shi Chen
Qi Zhao
16
47
0
18 Mar 2019
Counterfactual Critic Multi-Agent Training for Scene Graph Generation
Counterfactual Critic Multi-Agent Training for Scene Graph Generation
Long Chen
Hanwang Zhang
Jun Xiao
Xiangnan He
Shiliang Pu
Shih-Fu Chang
14
159
0
06 Dec 2018
A Comprehensive Survey of Deep Learning for Image Captioning
A Comprehensive Survey of Deep Learning for Image Captioning
Md. Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
28
760
0
06 Oct 2018
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and
  Comprehensive Image Captions
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions
Fenglin Liu
Xuancheng Ren
Yuanxin Liu
Houfeng Wang
Xu Sun
95
65
0
27 Aug 2018
12
Next