1607.08822
Cited By
SPICE: Semantic Propositional Image Caption Evaluation
29 July 2016
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
Papers citing
"SPICE: Semantic Propositional Image Caption Evaluation"
50 / 1,002 papers shown
Cross-Domain Image Captioning with Discriminative Finetuning
Computer Vision and Pattern Recognition (CVPR), 2023
Roberto Dessì
Michele Bevilacqua
Eleonora Gualdoni
Nathanaël Carraz Rakotonirina
Francesca Franzon
Marco Baroni
CLIP
248
26
0
04 Apr 2023
Prefix tuning for automated audio captioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
356
53
0
30 Mar 2023
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Xinhao Mei
Chutong Meng
Haohe Liu
Qiuqiang Kong
Tom Ko
Chengqi Zhao
Mark D. Plumbley
Yuexian Zou
Wenwu Wang
340
313
0
30 Mar 2023
AutoAD: Movie Description in Context
Computer Vision and Pattern Recognition (CVPR), 2023
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
258
49
0
29 Mar 2023
Hierarchical Video-Moment Retrieval and Step-Captioning
Computer Vision and Pattern Recognition (CVPR), 2023
Abhaysinh Zala
Jaemin Cho
Satwik Kottur
Xilun Chen
Barlas Oğuz
Yashar Mehdad
Mohit Bansal
3DV
286
85
0
29 Mar 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Computer Vision and Pattern Recognition (CVPR), 2023
Sara Sarto
Manuele Barraco
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
339
87
0
21 Mar 2023
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
IEEE International Conference on Computer Vision (ICCV), 2023
Yushi Hu
Benlin Liu
Jungo Kasai
Yizhong Wang
Mari Ostendorf
Ranjay Krishna
Noah A. Smith
EGVM
354
348
0
21 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
Yangqiaoyu Zhou
Kai-Lang Yao
Wusuo Li
MedIm
175
1
0
17 Mar 2023
Lana: A Language-Capable Navigator for Instruction Following and Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Xiaohan Wang
Wenguan Wang
Jiayi Shao
Yi Yang
LLMAG
LM&Ro
246
56
0
15 Mar 2023
PR-MCS: Perturbation Robust Metric for MultiLingual Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yongil Kim
Yerin Hwang
Hyeongu Yun
Seunghyun Yoon
Trung Bui
Kyomin Jung
278
7
0
15 Mar 2023
FactReranker: Fact-guided Reranker for Faithful Radiology Report Summarization
Qianqian Xie
Jiayu Zhou
Yifan Peng
Fei Wang
HILM
MedIm
241
16
0
15 Mar 2023
ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Bang-ju Yang
Fenglin Liu
Yuexian Zou
Xian Wu
Yaowei Wang
David Clifton
273
12
0
11 Mar 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning
International Journal of Computer Vision (IJCV), 2023
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
200
12
0
11 Mar 2023
Refined Vision-Language Modeling for Fine-grained Multi-modal Pre-training
Lisai Zhang
Qingcai Chen
Zhijian Chen
Yunpeng Han
Zhonghua Li
Bo Zhao
VLM
154
1
0
09 Mar 2023
Interpretable Visual Question Answering Referring to Outside Knowledge
IEEE International Conference on Image Processing (ICIP), 2023
He Zhu
Ren Togo
Takahiro Ogawa
Miki Haseyama
163
1
0
08 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
The Visual Computer (TVC), 2023
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
323
34
0
07 Mar 2023
Neighborhood Contrastive Transformer for Change Captioning
IEEE Transactions on Multimedia (IEEE TMM), 2023
Yunbin Tu
Liang Li
Li Su
Kelvin Lu
Qin Huang
ViT
191
27
0
06 Mar 2023
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training
International Conference on Learning Representations (ICLR), 2023
Wei Li
Linchao Zhu
Longyin Wen
Yi Yang
VLM
229
119
0
06 Mar 2023
Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning
Pranav Dandwate
Chaitanya Shahane
V. Jagtap
Shridevi C. Karande
183
9
0
05 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Computer Vision and Pattern Recognition (CVPR), 2023
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL
DiffM
225
58
0
04 Mar 2023
Language Is Not All You Need: Aligning Perception with Language Models
Neural Information Processing Systems (NeurIPS), 2023
Shaohan Huang
Li Dong
Wenhui Wang
Y. Hao
Saksham Singhal
...
Johan Bjorck
Vishrav Chaudhary
Subhojit Som
Xia Song
Furu Wei
VLM
LRM
MLLM
345
680
0
27 Feb 2023
Learning Visual Representations via Language-Guided Sampling
Computer Vision and Pattern Recognition (CVPR), 2023
Mohamed El Banani
Karan Desai
Justin Johnson
SSL
VLM
407
36
0
23 Feb 2023
Test-Time Distribution Normalization for Contrastively Learned Vision-language Models
Neural Information Processing Systems (NeurIPS), 2023
Yi Zhou
Juntao Ren
Fengyu Li
Ramin Zabih
Ser-Nam Lim
VLM
250
21
0
22 Feb 2023
Retrieval-augmented Image Captioning
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
R. Ramos
Desmond Elliott
Bruno Martins
VLM
193
43
0
16 Feb 2023
Towards Local Visual Modeling for Image Captioning
Pattern Recognition (Pattern Recogn.), 2023
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Weihao Ye
Rongrong Ji
ViT
242
107
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
192
4
0
08 Feb 2023
KENGIC: KEyword-driven and N-Gram Graph based Image Captioning
International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2022
Brandon Birmingham
A. Muscat
117
1
0
07 Feb 2023
DEVICE: Depth and Visual Concepts Aware Transformer for OCR-based Image Captioning
Pattern Recognition (Pattern Recogn.), 2023
Dongsheng Xu
Qingbao Huang
Shuang Feng
Yiru Cai
Feng Shuang
Yi Cai
ViT
VLM
483
1
0
03 Feb 2023
Style-Aware Contrastive Learning for Multi-Style Image Captioning
Findings of the Association for Computational Linguistics (Findings), 2023
Yucheng Zhou
Guodong Long
151
28
0
26 Jan 2023
Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data
IEEE Access (IEEE Access), 2023
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
SSL
VLM
155
10
0
26 Jan 2023
Towards a Unified Model for Generating Answers and Explanations in Visual Question Answering
Findings of the Association for Computational Linguistics (Findings), 2023
Chenxi Whitehouse
Tillman Weyde
Pranava Madhyastha
LRM
271
4
0
25 Jan 2023
Visual Semantic Relatedness Dataset for Image Captioning
Ahmed Sabir
Francesc Moreno-Noguer
Lluís Padró
CoGe
VLM
201
4
0
20 Jan 2023
Embodied Agents for Efficient Exploration and Smart Scene Description
IEEE International Conference on Robotics and Automation (ICRA), 2023
Roberto Bigazzi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
LM&Ro
178
9
0
17 Jan 2023
Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review
Reza Azad
Amirhossein Kazerouni
Moein Heidari
Ehsan Khodapanah Aghdam
Amir Molaei
Yiwei Jia
Abin Jose
Rijo Roy
Dorit Merhof
MedIm
ViT
381
308
0
09 Jan 2023
Adaptively Clustering Neighbor Elements for Image-Text Generation
Zihua Wang
Xu Yang
Hanwang Zhang
Haiyang Xu
Mingshi Yan
Feisi Huang
Yu Zhang
VLM
467
0
0
05 Jan 2023
Do DALL-E and Flamingo Understand Each Other?
IEEE International Conference on Computer Vision (ICCV), 2022
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
226
14
0
23 Dec 2022
Benchmarking Spatial Relationships in Text-to-Image Generation
Tejas Gokhale
Hamid Palangi
Besmira Nushi
Vibhav Vineet
Eric Horvitz
Ece Kamar
Chitta Baral
Yezhou Yang
EGVM
371
86
0
20 Dec 2022
MetaCLUE: Towards Comprehensive Visual Metaphors Research
Computer Vision and Pattern Recognition (CVPR), 2022
Arjun Reddy Akula
Brenda S. Driscoll
P. Narayana
Soravit Changpinyo
Zhi-xuan Jia
...
Sugato Basu
Leonidas Guibas
William T. Freeman
Yuanzhen Li
Varun Jampani
CLIP
VLM
201
42
0
19 Dec 2022
Efficient Image Captioning for Edge Devices
AAAI Conference on Artificial Intelligence (AAAI), 2022
Ning Wang
Jiangrong Xie
Hangzai Luo
Qinglin Cheng
Jihao Wu
Mingbo Jia
Linlin Li
VLM
CLIP
210
39
0
18 Dec 2022
Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level Natural Language Explanations
Björn Plüster
Jakob Ambsdorf
Lukas Braach
Jae Hee Lee
S. Wermter
221
6
0
08 Dec 2022
Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Ukyo Honda
Taro Watanabe
Yuji Matsumoto
229
11
0
06 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2022
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
237
115
0
06 Dec 2022
Towards Generating Diverse Audio Captions via Adversarial Training
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
296
6
0
05 Dec 2022
Controllable Image Captioning via Prompting
AAAI Conference on Artificial Intelligence (AAAI), 2022
Ning Wang
Jiahao Xie
Jihao Wu
Mingbo Jia
Linlin Li
232
36
0
04 Dec 2022
Uncertainty-Aware Image Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
UQLM
194
18
0
30 Nov 2022
CLID: Controlled-Length Image Descriptions with Limited Data
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Elad Hirsch
A. Tal
VLM
3DV
219
5
0
27 Nov 2022
Aesthetically Relevant Image Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Zhipeng Zhong
Fei Zhou
Guoping Qiu
132
15
0
25 Nov 2022
Aligning Source Visual and Target Language Domains for Unpaired Video Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
Fenglin Liu
Xian Wu
Chenyu You
Shen Ge
Yuexian Zou
Xu Sun
249
30
0
22 Nov 2022
Exploring Discrete Diffusion Models for Image Captioning
Zixin Zhu
Yixuan Wei
Jianfeng Wang
Zhe Gan
Zheng Zhang
Le Wang
G. Hua
Lijuan Wang
Zicheng Liu
Han Hu
DiffM
VLM
270
32
0
21 Nov 2022
VER: Unifying Verbalizing Entities and Relations
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Jie Huang
Kevin Chen-Chuan Chang
323
1
0
20 Nov 2022
Page 9 of 21