SPICE: Semantic Propositional Image Caption Evaluation


29 July 2016
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
    EGVM

Papers citing "SPICE: Semantic Propositional Image Caption Evaluation"

50 / 1,002 papers shown
JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models
Yuiga Wada
Kanta Kaneda
Komei Sugiura
236
6
0
07 Nov 2023
Multitask Multimodal Prompted Training for Interactive Embodied Task Completion
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Georgios Pantazopoulos
Malvina Nikandrou
Amit Parekh
Bhathiya Hemanthage
Arash Eshghi
Ioannis Konstas
Verena Rieser
Oliver Lemon
Alessandro Suglia
LM&Ro
199
10
0
07 Nov 2023
LLM4Drive: A Survey of Large Language Models for Autonomous Driving
Zhenjie Yang
Xiaosong Jia
Guoying Gu
Junchi Yan
ELM
604
171
0
02 Nov 2023
CapsFusion: Rethinking Image-Text Data at Scale
Computer Vision and Pattern Recognition (CVPR), 2023
Qiying Yu
Quan-Sen Sun
Xiaosong Zhang
Yufeng Cui
Fan Zhang
Yue Cao
Xinlong Wang
Jingjing Liu
VLM
371
88
0
31 Oct 2023
Video-Helpful Multimodal Machine Translation
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yihang Li
Shuichiro Shimizu
Chenhui Chu
Sadao Kurohashi
Wei Li
175
2
0
31 Oct 2023
Generating Context-Aware Natural Answers for Questions in 3D Scenes
British Machine Vision Conference (BMVC), 2023
Mohammed Munzer Dwedari
Matthias Niessner
Dave Zhenyu Chen
203
6
0
30 Oct 2023
Are NLP Models Good at Tracing Thoughts: An Overview of Narrative Understanding
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Lixing Zhu
Runcong Zhao
Lin Gui
Yulan He
251
10
0
28 Oct 2023
An Early Evaluation of GPT-4V(ision)
Yang Wu
Shilong Wang
Hao Yang
Tian Zheng
Hongbo Zhang
Yanyan Zhao
Bing Qin
MLLM, ELM
193
48
0
25 Oct 2023
Evaluating, Understanding, and Improving Constrained Text Generation for Large Language Models
Xiang Chen
Xiaojun Wan
184
2
0
25 Oct 2023
Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation
Yinjie Lei
Zixuan Wang
Feng Chen
Guoqing Wang
Peng Wang
Yang Yang
276
17
0
24 Oct 2023
CLAIR: Evaluating Image Captions with Large Language Models
David M. Chan
Suzanne Petryk
Joseph E. Gonzalez
Trevor Darrell
John F. Canny
198
36
0
19 Oct 2023
Evaluating the Fairness of Discriminative Foundation Models in Computer Vision
AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2023
Junaid Ali
Matthäus Kleindessner
F. Wenzel
Kailash Budhathoki
Volkan Cevher
Chris Russell
VLM
248
15
0
18 Oct 2023
Bounding and Filling: A Fast and Flexible Framework for Image Captioning
Zheng Ma
Changxin Wang
Bo Huang
Zi-Yue Zhu
Jianbing Zhang
187
3
0
15 Oct 2023
Analyzing and Mitigating Object Hallucination in Large Vision-Language Models
International Conference on Learning Representations (ICLR), 2023
Yiyang Zhou
Chenhang Cui
Jaehong Yoon
Linjun Zhang
Zhun Deng
Chelsea Finn
Mohit Bansal
Huaxiu Yao
MLLM
369
268
0
01 Oct 2023
Self-supervised Cross-view Representation Reconstruction for Change Captioning
IEEE International Conference on Computer Vision (ICCV), 2023
Yunbin Tu
Liang Li
Filippos Christianos
Zheng-Jun Zha
Zhibin Li
Qingming Huang
SSL
195
39
0
28 Sep 2023
Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness
IEEE Games Entertainment Media Conference (IEEE GEM), 2023
Valentin Barriere
Felipe del Rio
Andres Carvallo De Ferari
Carlos Aspillaga
Eugenio Herrera-Berg
Cristian Buc Calderon
DiffM
233
0
0
27 Sep 2023
MindGPT: Interpreting What You See with Non-invasive Brain Recordings
IEEE Transactions on Image Processing (IEEE TIP), 2023
Jiaxuan Chen
Yu Qi
Yueming Wang
Gang Pan
267
12
0
27 Sep 2023
Weakly-supervised Automated Audio Captioning via text only training
Theodoros Kouzelis
Vassilis Katsouros
CLIP
235
12
0
21 Sep 2023
ContextRef: Evaluating Referenceless Metrics For Image Description Generation
International Conference on Learning Representations (ICLR), 2023
Elisa Kreiss
E. Zelikman
Christopher Potts
Nick Haber
246
5
0
21 Sep 2023
Toward Unified Controllable Text Generation via Regular Expression Instruction
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Xin Zheng
Hongyu Lin
Xianpei Han
Le Sun
223
7
0
19 Sep 2023
Predicate Classification Using Optimal Transport Loss in Scene Graph Generation
Sorachi Kurita
Satoshi Oyama
Itsuki Noda
OT
170
0
0
19 Sep 2023
Synth-AC: Enhancing Audio Captioning with Synthetic Supervision
Feiyang Xiao
Qiaoxi Zhu
Jian Guan
Xubo Liu
Haohe Liu
Kejia Zhang
Wenwu Wang
177
2
0
18 Sep 2023
Viewpoint Integration and Registration with Vision Language Foundation Model for Image Change Understanding
Xiaonan Lu
Jianlong Yuan
Ruigang Niu
Yuan Hu
Fan Wang
152
3
0
15 Sep 2023
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Minsu Kim
J. Choi
Soumi Maiti
Jeong Hun Yeo
Shinji Watanabe
Y. Ro
VLM
203
8
0
15 Sep 2023
Learning to Predict Concept Ordering for Common Sense Generation
International Joint Conference on Natural Language Processing (IJCNLP), 2023
Tianhui Zhang
Danushka Bollegala
Bei Peng
LRM
127
3
0
12 Sep 2023
Prefix-diffusion: A Lightweight Diffusion Model for Diverse Image Captioning
International Conference on Language Resources and Evaluation (LREC), 2023
Guisheng Liu
Yi Li
Zhengcong Fei
Haiyan Fu
Xiangyang Luo
Yanqing Guo
VLM, DiffM
262
16
0
10 Sep 2023
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Wei Suo
Mengyang Sun
Weisong Liu
Yi-Meng Gao
Peifeng Wang
Yanning Zhang
Qi Wu
LRM
204
11
0
05 Sep 2023
NICE: CVPR 2023 Challenge on Zero-shot Image Captioning
Taehoon Kim
Pyunghwan Ahn
Sangyun Kim
Sihaeng Lee
Mark A Marsden
...
Yujin Wang
Yimu Wang
Tiancheng Gu
Xingchang Lv
Mingmao Sun
VLM
299
8
0
05 Sep 2023
DeViL: Decoding Vision features into Language
Meghal Dani
Isabel Rio-Torto
Stephan Alaniz
Zeynep Akata
VLM
196
11
0
04 Sep 2023
CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding
Etienne Labbé
Thomas Pellegrini
J. Pinquier
288
21
0
01 Sep 2023
Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Joshua Forster Feinglass
Yezhou Yang
177
2
0
01 Sep 2023
Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval?
Etienne Labbé
Thomas Pellegrini
J. Pinquier
169
5
0
29 Aug 2023
Explaining Vision and Language through Graphs of Events in Space and Time
Mihai Masala
Nicolae Cudlenco
Traian Rebedea
Marius Leordeanu
VLM
188
4
0
29 Aug 2023
MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Bang-ju Yang
Fenglin Liu
X. Wu
Yaowei Wang
Xu Sun
Yuexian Zou
VLM, CLIP
227
20
0
25 Aug 2023
With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
IEEE International Conference on Computer Vision (ICCV), 2023
Manuele Barraco
Sara Sarto
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
VLM
186
30
0
23 Aug 2023
CgT-GAN: CLIP-guided Text GAN for Image Captioning
ACM Multimedia (ACM MM), 2023
Jiarui Yu
Haoran Li
Y. Hao
B. Zhu
Tong Xu
Xiangnan He
VLM, CLIP
229
24
0
23 Aug 2023
Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement
Daiki Takeuchi
Yasunori Ohishi
Daisuke Niizumi
Noboru Harada
K. Kashino
221
10
0
23 Aug 2023
Explore and Tell: Embodied Visual Captioning in 3D Environments
IEEE International Conference on Computer Vision (ICCV), 2023
Anwen Hu
Shizhe Chen
Liang Zhang
Qin Jin
LM&Ro
199
3
0
21 Aug 2023
Uni-NLX: Unifying Textual Explanations for Vision and Vision-Language Tasks
Fawaz Sammani
Nikos Deligiannis
187
6
0
17 Aug 2023
Informative Scene Graph Generation via Debiasing
International Journal of Computer Vision (IJCV), 2023
Lianli Gao
Xinyu Lyu
Yuyu Guo
Yuxuan Hu
Yuanyou Li
Lu Xu
Hengtao Shen
Jingkuan Song
199
5
0
10 Aug 2023
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World
International Conference on Learning Representations (ICLR), 2023
Weiyun Wang
Min Shi
Qingyun Li
Wen Wang
Zhenhang Huang
...
Zhiguo Cao
Yushi Chen
Tong Lu
Jifeng Dai
Yu Qiao
LRM, MLLM
270
118
0
03 Aug 2023
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning
IEEE International Conference on Computer Vision (ICCV), 2023
Junjie Fei
Teng Wang
Jinrui Zhang
Zhenyu He
Chengjie Wang
Feng Zheng
VLM
169
65
0
31 Jul 2023
Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation
ACM Multimedia Asia (MA), 2023
Zhiyuan Li
Dongnan Liu
Heng Wang
Chaoyi Zhang
Weidong (Tom) Cai
RALM
190
2
0
27 Jul 2023
Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models
IEEE International Conference on Computer Vision (ICCV), 2023
Dong Lu
Zhiqiang Wang
Teng Wang
Weili Guan
Hongchang Gao
Feng Zheng
AAML
273
121
0
26 Jul 2023
Kefa: A Knowledge Enhanced and Fine-grained Aligned Speaker for Navigation Instruction Generation
Haitian Zeng
Xiaohan Wang
Wenguan Wang
Yi Yang
268
10
0
25 Jul 2023
Improving Multimodal Datasets with Image Captioning
Neural Information Processing Systems (NeurIPS), 2023
Thao Nguyen
S. Gadre
Gabriel Ilharco
Sewoong Oh
Ludwig Schmidt
VLM
263
125
0
19 Jul 2023
Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments
R. Liu
Kailai Li
Kunyu Peng
Junwei Zheng
Ke Cao
Yufan Chen
Kailun Yang
Rainer Stiefelhagen
142
22
0
15 Jul 2023
Linear Alignment of Vision-language Models for Image Captioning
Fabian Paischer
M. Hofmarcher
Sepp Hochreiter
Thomas Adler
CLIP, VLM
486
2
0
10 Jul 2023
Transformers in Healthcare: A Survey
Subhash Nerella
S. Bandyopadhyay
Jiaqing Zhang
Miguel Contreras
Scott Siegel
...
Jessica Sena
B. Shickel
A. Bihorac
Kia Khezeli
Parisa Rashidi
MedIm, AI4CE
267
89
0
30 Jun 2023
ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles
Natural Language Processing and Chinese Computing (NLPCC), 2023
Haoqin Tu
Bowen Yang
Xianfeng Zhao
173
7
0
29 Jun 2023