SPICE: Semantic Propositional Image Caption Evaluation

29 July 2016

Papers citing "SPICE: Semantic Propositional Image Caption Evaluation"

Local Interpretations for Explainable Natural Language Processing: A Survey

ACM Computing Surveys (CSUR), 2021

418

20 Mar 2021

Constrained Text Generation with Global Guidance -- Case Study on CommonGen

Yue Zhang

179

12 Mar 2021

Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision

International Journal of Computer Vision (IJCV), 2021

Andrew Shin

Masato Ishii

T. Narihira

310

06 Mar 2021

Causal Attention for Vision-Language Tasks

Computer Vision and Pattern Recognition (CVPR), 2021

Jianfei Cai

239

195

05 Mar 2021

CrossMap Transformer: A Crossmodal Masked Path Transformer Using Double Back-Translation for Vision-and-Language Navigation

IEEE Robotics and Automation Letters (RA-L), 2021

A. Magassouba

K. Sugiura

Hisashi Kawai

148

01 Mar 2021

Investigating Local and Global Information for Automated Audio Captioning with Transfer Learning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2021

171

23 Feb 2021

VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

Computer Vision and Pattern Recognition (CVPR), 2021

463

276

20 Feb 2021

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

Computer Vision and Pattern Recognition (CVPR), 2021

1.2K

1,370

17 Feb 2021

Improved Bengali Image Captioning via deep convolutional neural network based encoder-decoder model

126

14 Feb 2021

The Role of the Input in Natural Language Video Description

IEEE Transactions on Multimedia (TMM), 2020

S. Cascianelli

G. Costante

Alessandro Devo

Thomas Alessandro Ciarfuglia

P. Valigi

M. L. Fravolini

154

09 Feb 2021

Unifying Vision-and-Language Tasks via Text Generation

International Conference on Machine Learning (ICML), 2021

614

611

04 Feb 2021

The Role of Syntactic Planning in Compositional Image Captioning

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021

Emanuele Bugliarello

Desmond Elliott

107

28 Jan 2021

On the Evaluation of Vision-and-Language Navigation Instructions

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021

487

26 Jan 2021

ECOL-R: Encouraging Copying in Novel Object Captioning with Reinforcement Learning

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2021

158

25 Jan 2021

Fast Sequence Generation with Multi-Agent Reinforcement Learning

Jing Liu

181

24 Jan 2021

Macroscopic Control of Text Generation for Image Captioning

Zhangzi Zhu

Tianlei Wang

Hong Qu

196

20 Jan 2021

Diagnostic Captioning: A Survey

Knowledge and Information Systems (KAIS), 2021

237

18 Jan 2021

Kimera: from SLAM to Spatial Perception with 3D Dynamic Scene Graphs

448

300

18 Jan 2021

Dual-Level Collaborative Transformer for Image Captioning

AAAI Conference on Artificial Intelligence (AAAI), 2021

Yunpeng Luo

Jiayi Ji

Xiaoshuai Sun

Liujuan Cao

Yongjian Wu

247

329

16 Jan 2021

On-the-Fly Attention Modulation for Neural Generation

Findings (Findings), 2021

Yejin Choi

298

02 Jan 2021

Text-Free Image-to-Speech Synthesis Using Learned Segmental Units

Annual Meeting of the Association for Computational Linguistics (ACL), 2020

191

31 Dec 2020

Image-to-Image Retrieval by Learning Similarity between Scene Graphs

AAAI Conference on Artificial Intelligence (AAAI), 2020

220

29 Dec 2020

WEmbSim: A Simple yet Effective Metric for Image Captioning

International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2020

Wei Liu

113

24 Dec 2020

LCEval: Learned Composite Metric for Caption Evaluation

International Journal of Computer Vision (IJCV), 2019

Wei Liu

136

24 Dec 2020

SubICap: Towards Subword-informed Image Captioning

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020

Naeha Sharif

Bennamoun

Wei Liu

Syed Afaq Ali Shah

112

24 Dec 2020

Lexically-constrained Text Generation through Commonsense Knowledge Extraction and Injection

Yikang Li

P. Goel

Varsha Kuppur Rajendra

Kaixin Ma

208

19 Dec 2020

AutoCaption: Image Captioning with Neural Architecture Search

Xinxin Zhu

Weining Wang

Longteng Guo

Jing Liu

277

16 Dec 2020

Intrinsic Image Captioning Evaluation

Chao Zeng

Sam Kwong

103

14 Dec 2020

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network

AAAI Conference on Artificial Intelligence (AAAI), 2020

Jiayi Ji

Yongjian Wu

206

201

13 Dec 2020

MiniVLM: A Smaller and Faster Vision-Language Model

Xiaowei Hu

Zicheng Liu

265

13 Dec 2020

Image Captioning with Context-Aware Auxiliary Guidance

AAAI Conference on Artificial Intelligence (AAAI), 2020

224

10 Dec 2020

Simple is not Easy: A Simple Strong Baseline for TextVQA and TextCaps

AAAI Conference on Artificial Intelligence (AAAI), 2020

Qi Zhu

Chenyu Gao

Peng Wang

Qi Wu

208

09 Dec 2020

Towards Annotation-Free Evaluation of Cross-Lingual Image Captioning

217

09 Dec 2020

TAP: Text-Aware Pre-training for Text-VQA and Text-Caption

Lei Zhang

266

159

08 Dec 2020

Confidence-aware Non-repetitive Multimodal Transformers for TextCaps

Zhaokai Wang

Renda Bao

Qi Wu

Si Liu

349

07 Dec 2020

An Enhanced Knowledge Injection Model for Commonsense Generation

International Conference on Computational Linguistics (COLING), 2020

Xuanjing Huang

262

01 Dec 2020

Language-Driven Region Pointer Advancement for Controllable Image Captioning

International Conference on Computational Linguistics (COLING), 2020

Annika Lindh

R. Ross

John D. Kelleher

122

30 Nov 2020

A Comprehensive Review on Recent Methods and Challenges of Video Description

203

30 Nov 2020

FFCI: A Framework for Interpretable Automatic Evaluation of Summarization

Journal of Artificial Intelligence Research (JAIR), 2020

261

27 Nov 2020

Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language

Jianwei Yang

210

18 Nov 2020

Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze

217

09 Nov 2020

A Gold Standard Methodology for Evaluating Accuracy in Data-To-Text Systems

Craig Thomson

Ehud Reiter

140

08 Nov 2020

Diverse Image Captioning with Context-Object Split Latent Spaces

Neural Information Processing Systems (NeurIPS), 2020

Shweta Mahajan

Stefan Roth

206

02 Nov 2020

DeepOpht: Medical Report Generation for Retinal Images via Deep Models and Visual Explanation

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2020

...

213

01 Nov 2020

Fusion Models for Improved Visual Captioning

197

28 Oct 2020

Pre-training Text-to-Text Transformers for Concept-centric Common Sense

International Conference on Learning Representations (ICLR), 2020

Xiang Ren

258

24 Oct 2020

WaveTransformer: A Novel Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information

An Tran

Konstantinos Drossos

Maria Sandsten

208

21 Oct 2020

A Survey on Deep Learning and Explainability for Automatic Report Generation from Medical Images

ACM Computing Surveys (ACM CSUR), 2020

327

20 Oct 2020

Multimodal Research in Vision and Language: A Review of Current and Emerging Trends

Roger Zimmermann

282

19 Oct 2020

Positioning yourself in the maze of Neural Text Generation: A Task-Agnostic Survey

Khyathi Chandu

A. Black

261

14 Oct 2020