SPICE: Semantic Propositional Image Caption Evaluation

29 July 2016
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
    EGVM

Papers citing "SPICE: Semantic Propositional Image Caption Evaluation"

50 / 1,002 papers shown
Cross-Domain Image Captioning with Discriminative Finetuning
Computer Vision and Pattern Recognition (CVPR), 2023
Roberto Dessì
Michele Bevilacqua
Eleonora Gualdoni
Nathanaël Carraz Rakotonirina
Francesca Franzon
Marco Baroni
CLIP
248
26
0
04 Apr 2023
Prefix tuning for automated audio captioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
356
53
0
30 Mar 2023
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Xinhao Mei
Chutong Meng
Haohe Liu
Qiuqiang Kong
Tom Ko
Chengqi Zhao
Mark D. Plumbley
Yuexian Zou
Wenwu Wang
340
313
0
30 Mar 2023
AutoAD: Movie Description in Context
Computer Vision and Pattern Recognition (CVPR), 2023
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
258
49
0
29 Mar 2023
Hierarchical Video-Moment Retrieval and Step-Captioning
Computer Vision and Pattern Recognition (CVPR), 2023
Abhaysinh Zala
Jaemin Cho
Satwik Kottur
Xilun Chen
Barlas Ouguz
Yashar Mehdad
Joey Tianyi Zhou
3DV
286
85
0
29 Mar 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Computer Vision and Pattern Recognition (CVPR), 2023
Sara Sarto
Manuele Barraco
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
339
87
0
21 Mar 2023
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
IEEE International Conference on Computer Vision (ICCV), 2023
Yushi Hu
Benlin Liu
Jungo Kasai
Yizhong Wang
Mari Ostendorf
Ranjay Krishna
Noah A. Smith
EGVM
354
348
0
21 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
Yangqiaoyu Zhou
Kai-Lang Yao
Wusuo Li
MedIm
175
1
0
17 Mar 2023
Lana: A Language-Capable Navigator for Instruction Following and Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Xiaohan Wang
Wenguan Wang
Jiayi Shao
Yi Yang
LLMAGLM&Ro
246
56
0
15 Mar 2023
PR-MCS: Perturbation Robust Metric for MultiLingual Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yongil Kim
Yerin Hwang
Hyeongu Yun
Seunghyun Yoon
Trung Bui
Kyomin Jung
278
7
0
15 Mar 2023
FactReranker: Fact-guided Reranker for Faithful Radiology Report Summarization
Qianqian Xie
Jiayu Zhou
Yifan Peng
Fei Wang
HILMMedIm
241
16
0
15 Mar 2023
ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Bang-ju Yang
Fenglin Liu
Yuexian Zou
Xian Wu
Yaowei Wang
David Clifton
273
12
0
11 Mar 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning
International Journal of Computer Vision (IJCV), 2023
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
200
12
0
11 Mar 2023
Refined Vision-Language Modeling for Fine-grained Multi-modal Pre-training
Lisai Zhang
Qingcai Chen
Zhijian Chen
Yunpeng Han
Zhonghua Li
Bo Zhao
VLM
154
1
0
09 Mar 2023
Interpretable Visual Question Answering Referring to Outside Knowledge
IEEE International Conference on Image Processing (ICIP), 2023
He Zhu
Ren Togo
Takahiro Ogawa
Miki Haseyama
163
1
0
08 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
The Visual Computer (TVC), 2023
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
323
34
0
07 Mar 2023
Neighborhood Contrastive Transformer for Change Captioning
IEEE Transactions on Multimedia (IEEE TMM), 2023
Yunbin Tu
Liang Li
Li Su
Kelvin Lu
Qin Huang
ViT
191
27
0
06 Mar 2023
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training
International Conference on Learning Representations (ICLR), 2023
Wei Li
Linchao Zhu
Longyin Wen
Yi Yang
VLM
229
119
0
06 Mar 2023
Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning
Pranav Dandwate
Chaitanya Shahane
V. Jagtap
Shridevi C. Karande
183
9
0
05 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Computer Vision and Pattern Recognition (CVPR), 2023
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDLDiffM
225
58
0
04 Mar 2023
Language Is Not All You Need: Aligning Perception with Language Models
Neural Information Processing Systems (NeurIPS), 2023
Shaohan Huang
Li Dong
Wenhui Wang
Y. Hao
Saksham Singhal
...
Johan Bjorck
Vishrav Chaudhary
Subhojit Som
Xia Song
Furu Wei
VLMLRMMLLM
345
680
0
27 Feb 2023
Learning Visual Representations via Language-Guided Sampling
Computer Vision and Pattern Recognition (CVPR), 2023
Mohamed El Banani
Karan Desai
Justin Johnson
SSLVLM
407
36
0
23 Feb 2023
Test-Time Distribution Normalization for Contrastively Learned Vision-language Models
Neural Information Processing Systems (NeurIPS), 2023
Yi Zhou
Juntao Ren
Fengyu Li
Ramin Zabih
Ser-Nam Lim
VLM
250
21
0
22 Feb 2023
Retrieval-augmented Image Captioning
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
R. Ramos
Desmond Elliott
Bruno Martins
VLM
193
43
0
16 Feb 2023
Towards Local Visual Modeling for Image Captioning
Pattern Recognition (Pattern Recogn.), 2023
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Weihao Ye
Rongrong Ji
ViT
242
107
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
192
4
0
08 Feb 2023
KENGIC: KEyword-driven and N-Gram Graph based Image Captioning
International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2022
Brandon Birmingham
A. Muscat
117
1
0
07 Feb 2023
DEVICE: Depth and Visual Concepts Aware Transformer for OCR-based Image Captioning
Pattern Recognition (Pattern Recogn.), 2023
Dongsheng Xu
Qingbao Huang
Shuang Feng
Yiru Cai
Feng Shuang
Yi Cai
ViTVLM
483
1
0
03 Feb 2023
Style-Aware Contrastive Learning for Multi-Style Image Captioning
Findings (Findings), 2023
Yucheng Zhou
Guodong Long
151
28
0
26 Jan 2023
Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data
IEEE Access (IEEE Access), 2023
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
SSLVLM
155
10
0
26 Jan 2023
Towards a Unified Model for Generating Answers and Explanations in Visual Question Answering
Findings (Findings), 2023
Chenxi Whitehouse
Tillman Weyde
Pranava Madhyastha
LRM
271
4
0
25 Jan 2023
Visual Semantic Relatedness Dataset for Image Captioning
Ahmed Sabir
Francesc Moreno-Noguer
Lluís Padró
CoGeVLM
201
4
0
20 Jan 2023
Embodied Agents for Efficient Exploration and Smart Scene Description
IEEE International Conference on Robotics and Automation (ICRA), 2023
Roberto Bigazzi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
LM&Ro
178
9
0
17 Jan 2023
Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review
Reza Azad
Amirhossein Kazerouni
Moein Heidari
Ehsan Khodapanah Aghdam
Amir Molaei
Yiwei Jia
Abin Jose
Rijo Roy
Dorit Merhof
MedImViT
381
308
0
09 Jan 2023
Adaptively Clustering Neighbor Elements for Image-Text Generation
Zihua Wang
Xu Yang
Hanwang Zhang
Haiyang Xu
Mingshi Yan
Feisi Huang
Yu Zhang
VLM
467
0
0
05 Jan 2023
Do DALL-E and Flamingo Understand Each Other?
IEEE International Conference on Computer Vision (ICCV), 2022
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
226
14
0
23 Dec 2022
Benchmarking Spatial Relationships in Text-to-Image Generation
Tejas Gokhale
Hamid Palangi
Besmira Nushi
Vibhav Vineet
Eric Horvitz
Ece Kamar
Chitta Baral
Yezhou Yang
EGVM
371
86
0
20 Dec 2022
MetaCLUE: Towards Comprehensive Visual Metaphors Research
Computer Vision and Pattern Recognition (CVPR), 2022
Arjun Reddy Akula
Brenda S. Driscoll
P. Narayana
Soravit Changpinyo
Zhi-xuan Jia
...
Sugato Basu
Leonidas Guibas
William T. Freeman
Yuanzhen Li
Varun Jampani
CLIPVLM
201
42
0
19 Dec 2022
Efficient Image Captioning for Edge Devices
AAAI Conference on Artificial Intelligence (AAAI), 2022
Ning Wang
Jiangrong Xie
Hangzai Luo
Qinglin Cheng
Jihao Wu
Mingbo Jia
Linlin Li
VLMCLIP
210
39
0
18 Dec 2022
Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level Natural Language Explanations
Björn Plüster
Jakob Ambsdorf
Lukas Braach
Jae Hee Lee
S. Wermter
221
6
0
08 Dec 2022
Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Ukyo Honda
Taro Watanabe
Yuji Matsumoto
229
11
0
06 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2022
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
237
115
0
06 Dec 2022
Towards Generating Diverse Audio Captions via Adversarial Training
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
296
6
0
05 Dec 2022
Controllable Image Captioning via Prompting
AAAI Conference on Artificial Intelligence (AAAI), 2022
Ning Wang
Jiahao Xie
Jihao Wu
Mingbo Jia
Linlin Li
232
36
0
04 Dec 2022
Uncertainty-Aware Image Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
UQLM
194
18
0
30 Nov 2022
CLID: Controlled-Length Image Descriptions with Limited Data
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Elad Hirsch
A. Tal
VLM3DV
219
5
0
27 Nov 2022
Aesthetically Relevant Image Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Zhipeng Zhong
Fei Zhou
Guoping Qiu
132
15
0
25 Nov 2022
Aligning Source Visual and Target Language Domains for Unpaired Video Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Fenglin Liu
Xian Wu
Chenyu You
Shen Ge
Yuexian Zou
Xu Sun
249
30
0
22 Nov 2022
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffMVLM
270
32
0
21 Nov 2022
VER: Unifying Verbalizing Entities and Relations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jie Huang
Kevin Chen-Chuan Chang
323
1
0
20 Nov 2022