Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

21 September 2016
Oriol Vinyals
Alexander Toshev
Samy Bengio
D. Erhan
arXiv: 1609.06647

Papers citing "Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge"

50 / 66 papers shown
ChatBEV: A Visual Language Model that Understands BEV Maps
Qingyao Xu
S. Chen
Guang Chen
Yanfeng Wang
Y. Zhang
46
0
0
18 Mar 2025
Progress-Aware Video Frame Captioning
Zihui Xue
Joungbin An
Xitong Yang
Kristen Grauman
100
1
0
03 Dec 2024
Evaluating Pragmatic Abilities of Image Captioners on A3DS
Polina Tsvilodub
Michael Franke
EGVM
17
3
0
22 May 2023
Retrieval-augmented Image Captioning
R. Ramos
Desmond Elliott
Bruno Martins
VLM
24
29
0
16 Feb 2023
On The Coherence of Quantitative Evaluation of Visual Explanations
Benjamin Vandersmissen
José Oramas
XAI
FAtt
26
3
0
14 Feb 2023
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU)
Rana Adnan Ahmad
Muhammad Azhar
Hina Sattar
21
10
0
06 Jan 2023
SLAM for Visually Impaired People: a Survey
Banafshe Marziyeh Bamdad
Davide Scaramuzza
Alireza Darvishy
10
8
0
09 Dec 2022
Progressive Tree-Structured Prototype Network for End-to-End Image Captioning
Pengpeng Zeng
Jinkuan Zhu
Jingkuan Song
Lianli Gao
VLM
22
27
0
17 Nov 2022
FSHMEM: Supporting Partitioned Global Address Space on FPGAs for Large-Scale Hardware Acceleration Infrastructure
Y. F. Arthanto
David Ojika
Joo-Young Kim
FedML
56
2
0
11 Jul 2022
Image Captioning based on Feature Refinement and Reflective Decoding
G. Alabduljabbar
Hafida Benhidour
Said Kerrache
3DV
14
3
0
16 Jun 2022
Prompt-based Learning for Unpaired Image Captioning
Peipei Zhu
Xiao Wang
Lin Zhu
Zhenglong Sun
Weishi Zheng
Yaowei Wang
C. L. P. Chen
VLM
21
31
0
26 May 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval
Zhiqiang Yuan
Wenkai Zhang
Kun Fu
Xuan Li
Chubo Deng
Hongqi Wang
Xian Sun
19
129
0
21 Apr 2022
The Unsurprising Effectiveness of Pre-Trained Vision Models for Control
Simone Parisi
Aravind Rajeswaran
Senthil Purushwalkam
Abhinav Gupta
LM&Ro
26
186
0
07 Mar 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition
Peipei Zhu
Xiao Wang
Yong Luo
Zhenglong Sun
Wei-Shi Zheng
Yaowei Wang
C. L. P. Chen
22
12
0
07 Mar 2022
CaMEL: Mean Teacher Learning for Image Captioning
Manuele Barraco
Matteo Stefanini
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
ViT
VLM
25
27
0
21 Feb 2022
Introducing the DOME Activation Functions
Mohamed E. Hussein
Wael AbdAlmageed
25
1
0
30 Sep 2021
Caption Enriched Samples for Improving Hateful Memes Detection
Efrat Blaier
Itzik Malkiel
Lior Wolf
VLM
51
21
0
22 Sep 2021
Dual Graph Convolutional Networks with Transformer and Curriculum Learning for Image Captioning
Xinzhi Dong
Chengjiang Long
Wenju Xu
Chunxia Xiao
ViT
69
66
0
05 Aug 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning
Paul Pu Liang
Yiwei Lyu
Xiang Fan
Zetian Wu
Yun Cheng
...
Peter Wu
Michelle A. Lee
Yuke Zhu
Ruslan Salakhutdinov
Louis-Philippe Morency
VLM
23
158
0
15 Jul 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
13
1,436
0
18 Apr 2021
Visual Goal-Step Inference using wikiHow
Yue Yang
Artemis Panagopoulou
Qing Lyu
Li Zhang
Mark Yatskar
Chris Callison-Burch
29
41
0
12 Apr 2021
Diagnostic Captioning: A Survey
John Pavlopoulos
Vasiliki Kougia
Ion Androutsopoulos
D. Papamichail
3DV
MedIm
89
26
0
18 Jan 2021
Towards Overcoming False Positives in Visual Relationship Detection
Daisheng Jin
Xiao Ma
Chongzhi Zhang
Yizhuo Zhou
Jiashu Tao
...
Haiyu Zhao
Shuai Yi
Zhoujun Li
Xianglong Liu
Hongsheng Li
17
5
0
23 Dec 2020
AutoCaption: Image Captioning with Neural Architecture Search
Xinxin Zhu
Weining Wang
Longteng Guo
Jing Liu
24
9
0
16 Dec 2020
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network
Jiayi Ji
Yunpeng Luo
Xiaoshuai Sun
Fuhai Chen
Gen Luo
Yongjian Wu
Yue Gao
Rongrong Ji
ViT
41
170
0
13 Dec 2020
Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale
Ozan Caglayan
Pranava Madhyastha
Lucia Specia
ELM
37
35
0
26 Oct 2020
Towards Unique and Informative Captioning of Images
Zeyu Wang
Berthy T. Feng
Karthik Narasimhan
Olga Russakovsky
17
37
0
08 Sep 2020
Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces
Yatin Nandwani
Deepanshu Jindal
Mausam
Parag Singla
16
13
0
27 Aug 2020
Explore and Explain: Self-supervised Navigation and Recounting
Roberto Bigazzi
Federico Landi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
EgoV
LM&Ro
8
17
0
14 Jul 2020
Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning
Longteng Guo
Jing Liu
Xinxin Zhu
Xingjian He
Jie Jiang
Hanqing Lu
BDL
14
56
0
10 May 2020
AIBench Scenario: Scenario-distilling AI Benchmarking
Wanling Gao
Fei Tang
Jianfeng Zhan
Xu Wen
Lei Wang
Zheng Cao
Chuanxin Lan
Chunjie Luo
Xiaoli Liu
Zihan Jiang
21
14
0
06 May 2020
AIBench Training: Balanced Industry-Standard AI Training Benchmarking
Fei Tang
Wanling Gao
Jianfeng Zhan
Chuanxin Lan
Xu Wen
...
Yatao Li
Junchao Shao
Zhenyu Wang
Xiaoyu Wang
Hainan Ye
25
3
0
30 Apr 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
114
189
0
19 Mar 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC)
C. Sur
23
16
0
15 Feb 2020
Personalizing Fast-Forward Videos Based on Visual and Textual Features from Social Network
W. Ramos
M. Silva
Edson Roteia Araujo Junior
Alan C. Neves
Erickson R. Nascimento
14
6
0
29 Dec 2019
Meshed-Memory Transformer for Image Captioning
Marcella Cornia
Matteo Stefanini
Lorenzo Baraldi
Rita Cucchiara
14
868
0
17 Dec 2019
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication
Ruize Wang
Zhongyu Wei
Ying Cheng
Piji Li
Haijun Shan
Ji Zhang
Qi Zhang
Xuanjing Huang
VGen
DiffM
15
13
0
11 Nov 2019
Semantic Object Accuracy for Generative Text-to-Image Synthesis
Tobias Hinz
Stefan Heinrich
S. Wermter
EGVM
24
158
0
29 Oct 2019
Compositional Generalization in Image Captioning
Mitja Nikolaus
Mostafa Abdou
Matthew Lamm
Rahul Aralikatte
Desmond Elliott
CoGe
21
49
0
10 Sep 2019
Aligning Linguistic Words and Visual Semantic Units for Image Captioning
Longteng Guo
Jing Liu
Jinhui Tang
Jiangwei Li
W. Luo
Hanqing Lu
14
102
0
06 Aug 2019
Physical Cue based Depth-Sensing by Color Coding with Deaberration Network
Nao Mishima
Tatsuo Kozakaya
Akihisa Moriya
R. Okada
S. Hiura
3DV
26
3
0
01 Aug 2019
Predicting Motion of Vulnerable Road Users using High-Definition Maps and Efficient ConvNets
Fang-Chieh Chou
Tsung-Han Lin
Henggang Cui
Vladan Radosavljevic
Thi Nguyen
Tzu-Kuo Huang
Matthew Niedoba
J. Schneider
Nemanja Djuric
8
58
0
20 Jun 2019
End-to-End Video Captioning
Silvio Olivastri
Gurkirt Singh
Fabio Cuzzolin
16
18
0
04 Apr 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding
Ning Xie
Farley Lai
Derek Doran
Asim Kadav
CoGe
31
321
0
20 Jan 2019
Pre-gen metrics: Predicting caption quality metrics without generating captions
Marc Tanti
Albert Gatt
K. Camilleri
14
2
0
12 Oct 2018
A Comprehensive Survey of Deep Learning for Image Captioning
Md. Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
28
760
0
06 Oct 2018
Context-Dependent Diffusion Network for Visual Relationship Detection
Zhen Cui
Chunyan Xu
Wenming Zheng
Jian Yang
GNN
12
50
0
11 Sep 2018
LUCSS: Language-based User-customized Colourization of Scene Sketches
C. Zou
Haoran Mo
Ruofei Du
Xing Wu
Chengying Gao
Hongbo Fu
22
8
0
30 Aug 2018
Exploring the Applications of Faster R-CNN and Single-Shot Multi-box Detection in a Smart Nursery Domain
S. Phon-Amnuaisuk
K. Murata
P. Pavarangkoon
Kazunori Yamamoto
Takamichi Mizuhara
ObjD
11
11
0
27 Aug 2018
Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features
Xu Yang
Hanwang Zhang
Jianfei Cai
42
74
0
01 Aug 2018