Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

21 September 2016

Papers citing "Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge"

50 / 66 papers shown

ChatBEV: A Visual Language Model that Understands BEV Maps Qingyao Xu S. Chen Guang Chen Yanfeng Wang Y. Zhang 46 0 0 18 Mar 2025
Progress-Aware Video Frame Captioning Zihui Xue Joungbin An Xitong Yang Kristen Grauman 100 1 0 03 Dec 2024
Evaluating Pragmatic Abilities of Image Captioners on A3DS Polina Tsvilodub Michael Franke EGVM 17 3 0 22 May 2023
Retrieval-augmented Image Captioning R. Ramos Desmond Elliott Bruno Martins VLM 24 29 0 16 Feb 2023
On The Coherence of Quantitative Evaluation of Visual Explanations Benjamin Vandersmissen José Oramas XAI FAtt 26 3 0 14 Feb 2023
An Image captioning algorithm based on the Hybrid Deep Learning Technique (CNN+GRU) Rana Adnan Ahmad Muhammad Azhar Hina Sattar 21 10 0 06 Jan 2023
SLAM for Visually Impaired People: a Survey Banafshe Marziyeh Bamdad Davide Scaramuzza Alireza Darvishy 10 8 0 09 Dec 2022
Progressive Tree-Structured Prototype Network for End-to-End Image Captioning Pengpeng Zeng Jinkuan Zhu Jingkuan Song Lianli Gao VLM 22 27 0 17 Nov 2022
FSHMEM: Supporting Partitioned Global Address Space on FPGAs for Large-Scale Hardware Acceleration Infrastructure Y. F. Arthanto David Ojika Joo-Young Kim FedML 56 2 0 11 Jul 2022
Image Captioning based on Feature Refinement and Reflective Decoding G. Alabduljabbar Hafida Benhidour Said Kerrache 3DV 14 3 0 16 Jun 2022
Prompt-based Learning for Unpaired Image Captioning Peipei Zhu Xiao Wang Lin Zhu Zhenglong Sun Weishi Zheng Yaowei Wang C. L. P. Chen VLM 21 31 0 26 May 2022
Exploring a Fine-Grained Multiscale Method for Cross-Modal Remote Sensing Image Retrieval Zhiqiang Yuan Wenkai Zhang Kun Fu Xuan Li Chubo Deng Hongqi Wang Xian Sun 19 129 0 21 Apr 2022
The Unsurprising Effectiveness of Pre-Trained Vision Models for Control Simone Parisi Aravind Rajeswaran Senthil Purushwalkam Abhinav Gupta LM&Ro 26 186 0 07 Mar 2022
Unpaired Image Captioning by Image-level Weakly-Supervised Visual Concept Recognition Peipei Zhu Xiao Wang Yong Luo Zhenglong Sun Wei-Shi Zheng Yaowei Wang C. L. P. Chen 22 12 0 07 Mar 2022
CaMEL: Mean Teacher Learning for Image Captioning Manuele Barraco Matteo Stefanini Marcella Cornia S. Cascianelli Lorenzo Baraldi Rita Cucchiara ViT VLM 25 27 0 21 Feb 2022
Introducing the DOME Activation Functions Mohamed E. Hussein Wael AbdAlmageed 25 1 0 30 Sep 2021
Caption Enriched Samples for Improving Hateful Memes Detection Efrat Blaier Itzik Malkiel Lior Wolf VLM 51 21 0 22 Sep 2021
Dual Graph Convolutional Networks with Transformer and Curriculum Learning for Image Captioning Xinzhi Dong Chengjiang Long Wenju Xu Chunxia Xiao ViT 69 66 0 05 Aug 2021
MultiBench: Multiscale Benchmarks for Multimodal Representation Learning Paul Pu Liang Yiwei Lyu Xiang Fan Zetian Wu Yun Cheng ... Peter Wu Michelle A. Lee Yuke Zhu Ruslan Salakhutdinov Louis-Philippe Morency VLM 23 158 0 15 Jul 2021
CLIPScore: A Reference-free Evaluation Metric for Image Captioning Jack Hessel Ari Holtzman Maxwell Forbes Ronan Le Bras Yejin Choi CLIP 13 1,436 0 18 Apr 2021
Visual Goal-Step Inference using wikiHow Yue Yang Artemis Panagopoulou Qing Lyu Li Zhang Mark Yatskar Chris Callison-Burch 29 41 0 12 Apr 2021
Diagnostic Captioning: A Survey John Pavlopoulos Vasiliki Kougia Ion Androutsopoulos D. Papamichail 3DV MedIm 89 26 0 18 Jan 2021
Towards Overcoming False Positives in Visual Relationship Detection Daisheng Jin Xiao Ma Chongzhi Zhang Yizhuo Zhou Jiashu Tao ... Haiyu Zhao Shuai Yi Zhoujun Li Xianglong Liu Hongsheng Li 17 5 0 23 Dec 2020
AutoCaption: Image Captioning with Neural Architecture Search Xinxin Zhu Weining Wang Longteng Guo Jing Liu 24 9 0 16 Dec 2020
Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network Jiayi Ji Yunpeng Luo Xiaoshuai Sun Fuhai Chen Gen Luo Yongjian Wu Yue Gao Rongrong Ji ViT 41 170 0 13 Dec 2020
Curious Case of Language Generation Evaluation Metrics: A Cautionary Tale Ozan Caglayan Pranava Madhyastha Lucia Specia ELM 37 35 0 26 Oct 2020
Towards Unique and Informative Captioning of Images Zeyu Wang Berthy T. Feng Karthik Narasimhan Olga Russakovsky 17 37 0 08 Sep 2020
Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces Yatin Nandwani Deepanshu Jindal Mausam Parag Singla 16 13 0 27 Aug 2020
Explore and Explain: Self-supervised Navigation and Recounting Roberto Bigazzi Federico Landi Marcella Cornia S. Cascianelli Lorenzo Baraldi Rita Cucchiara EgoV LM&Ro 8 17 0 14 Jul 2020
Non-Autoregressive Image Captioning with Counterfactuals-Critical Multi-Agent Learning Longteng Guo Jing Liu Xinxin Zhu Xingjian He Jie Jiang Hanqing Lu BDL 14 56 0 10 May 2020
AIBench Scenario: Scenario-distilling AI Benchmarking Wanling Gao Fei Tang Jianfeng Zhan Xu Wen Lei Wang Zheng Cao Chuanxin Lan Chunjie Luo Xiaoli Liu Zihan Jiang 21 14 0 06 May 2020
AIBench Training: Balanced Industry-Standard AI Training Benchmarking Fei Tang Wanling Gao Jianfeng Zhan Chuanxin Lan Xu Wen ... Yatao Li Junchao Shao Zhenyu Wang Xiaoyu Wang Hainan Ye 25 3 0 30 Apr 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning Longteng Guo Jing Liu Xinxin Zhu Peng Yao Shichen Lu Hanqing Lu ViT 114 189 0 19 Mar 2020
MRRC: Multiple Role Representation Crossover Interpretation for Image Captioning With R-CNN Feature Distribution Composition (FDC) C. Sur 23 16 0 15 Feb 2020
Personalizing Fast-Forward Videos Based on Visual and Textual Features from Social Network W. Ramos M. Silva Edson Roteia Araujo Junior Alan C. Neves Erickson R. Nascimento 14 6 0 29 Dec 2019
Meshed-Memory Transformer for Image Captioning Marcella Cornia Matteo Stefanini Lorenzo Baraldi Rita Cucchiara 14 868 0 17 Dec 2019
Keep it Consistent: Topic-Aware Storytelling from an Image Stream via Iterative Multi-agent Communication Ruize Wang Zhongyu Wei Ying Cheng Piji Li Haijun Shan Ji Zhang Qi Zhang Xuanjing Huang VGen DiffM 15 13 0 11 Nov 2019
Semantic Object Accuracy for Generative Text-to-Image Synthesis Tobias Hinz Stefan Heinrich S. Wermter EGVM 24 158 0 29 Oct 2019
Compositional Generalization in Image Captioning Mitja Nikolaus Mostafa Abdou Matthew Lamm Rahul Aralikatte Desmond Elliott CoGe 21 49 0 10 Sep 2019
Aligning Linguistic Words and Visual Semantic Units for Image Captioning Longteng Guo Jing Liu Jinhui Tang Jiangwei Li W. Luo Hanqing Lu 14 102 0 06 Aug 2019
Physical Cue based Depth-Sensing by Color Coding with Deaberration Network Nao Mishima Tatsuo Kozakaya Akihisa Moriya R. Okada S. Hiura 3DV 26 3 0 01 Aug 2019
Predicting Motion of Vulnerable Road Users using High-Definition Maps and Efficient ConvNets Fang-Chieh Chou Tsung-Han Lin Henggang Cui Vladan Radosavljevic Thi Nguyen Tzu-Kuo Huang Matthew Niedoba J. Schneider Nemanja Djuric 8 58 0 20 Jun 2019
End-to-End Video Captioning Silvio Olivastri Gurkirt Singh Fabio Cuzzolin 16 18 0 04 Apr 2019
Visual Entailment: A Novel Task for Fine-Grained Image Understanding Ning Xie Farley Lai Derek Doran Asim Kadav CoGe 31 321 0 20 Jan 2019
Pre-gen metrics: Predicting caption quality metrics without generating captions Marc Tanti Albert Gatt K. Camilleri 14 2 0 12 Oct 2018
A Comprehensive Survey of Deep Learning for Image Captioning Md. Zakir Hossain Ferdous Sohel M. Shiratuddin Hamid Laga VLM 3DV 28 760 0 06 Oct 2018
Context-Dependent Diffusion Network for Visual Relationship Detection Zhen Cui Chunyan Xu Wenming Zheng Jian Yang GNN 12 50 0 11 Sep 2018
LUCSS: Language-based User-customized Colourization of Scene Sketches C. Zou Haoran Mo Ruofei Du Xing Wu Chengying Gao Hongbo Fu 22 8 0 30 Aug 2018
Exploring the Applications of Faster R-CNN and Single-Shot Multi-box Detection in a Smart Nursery Domain S. Phon-Amnuaisuk K. Murata P. Pavarangkoon Kazunori Yamamoto Takamichi Mizuhara ObjD 11 11 0 27 Aug 2018
Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features Xu Yang Hanwang Zhang Jianfei Cai 42 74 0 01 Aug 2018