arXiv:1607.08822
SPICE: Semantic Propositional Image Caption Evaluation
29 July 2016
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
Papers citing "SPICE: Semantic Propositional Image Caption Evaluation"
50 / 1,005 papers shown
A-CAP: Anticipation Captioning with Commonsense Knowledge
Computer Vision and Pattern Recognition (CVPR), 2023
D. Vo
Quoc-An Luong
Akihiro Sugimoto
Hideki Nakayama
168
3
0
13 Apr 2023
Model-Agnostic Gender Debiased Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2023
Yusuke Hirota
Yuta Nakashima
Noa Garcia
FaML
349
25
0
07 Apr 2023
Graph Attention for Automated Audio Captioning
IEEE Signal Processing Letters (IEEE SPL), 2023
Feiyang Xiao
Jian Guan
Qiaoxi Zhu
Wenwu Wang
220
11
0
07 Apr 2023
Cross-Domain Image Captioning with Discriminative Finetuning
Computer Vision and Pattern Recognition (CVPR), 2023
Roberto Dessì
Michele Bevilacqua
Eleonora Gualdoni
Nathanaël Carraz Rakotonirina
Francesca Franzon
Marco Baroni
CLIP
254
28
0
04 Apr 2023
Prefix tuning for automated audio captioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Minkyu Kim
Kim Sung-Bin
Tae-Hyun Oh
375
54
0
30 Mar 2023
WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Xinhao Mei
Chutong Meng
Haohe Liu
Qiuqiang Kong
Tom Ko
Chengqi Zhao
Mark D. Plumbley
Yuexian Zou
Wenwu Wang
358
336
0
30 Mar 2023
AutoAD: Movie Description in Context
Computer Vision and Pattern Recognition (CVPR), 2023
Tengda Han
Max Bain
Arsha Nagrani
Gül Varol
Weidi Xie
Andrew Zisserman
VGen
280
50
0
29 Mar 2023
Hierarchical Video-Moment Retrieval and Step-Captioning
Computer Vision and Pattern Recognition (CVPR), 2023
Abhaysinh Zala
Jaemin Cho
Satwik Kottur
Xilun Chen
Barlas Oğuz
Yashar Mehdad
Joey Tianyi Zhou
3DV
298
87
0
29 Mar 2023
Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Computer Vision and Pattern Recognition (CVPR), 2023
Sara Sarto
Manuele Barraco
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
348
90
0
21 Mar 2023
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering
IEEE International Conference on Computer Vision (ICCV), 2023
Yushi Hu
Benlin Liu
Jungo Kasai
Yizhong Wang
Mari Ostendorf
Ranjay Krishna
Noah A. Smith
EGVM
386
364
0
21 Mar 2023
GNNFormer: A Graph-based Framework for Cytopathology Report Generation
Yangqiaoyu Zhou
Kai-Lang Yao
Wusuo Li
MedIm
188
1
0
17 Mar 2023
Lana: A Language-Capable Navigator for Instruction Following and Generation
Computer Vision and Pattern Recognition (CVPR), 2023
Xiaohan Wang
Wenguan Wang
Jiayi Shao
Yi Yang
LLMAG
LM&Ro
250
58
0
15 Mar 2023
PR-MCS: Perturbation Robust Metric for MultiLingual Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Yongil Kim
Yerin Hwang
Hyeongu Yun
Seunghyun Yoon
Trung Bui
Kyomin Jung
317
7
0
15 Mar 2023
FactReranker: Fact-guided Reranker for Faithful Radiology Report Summarization
Qianqian Xie
Jiayu Zhou
Yifan Peng
Fei Wang
HILM
MedIm
257
16
0
15 Mar 2023
ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Bang-ju Yang
Fenglin Liu
Yuexian Zou
Xian Wu
Yaowei Wang
David Clifton
308
14
0
11 Mar 2023
Learning Combinatorial Prompts for Universal Controllable Image Captioning
International Journal of Computer Vision (IJCV), 2023
Zhen Wang
Jun Xiao
Yueting Zhuang
Fei Gao
Jian Shao
Long Chen
209
12
0
11 Mar 2023
Refined Vision-Language Modeling for Fine-grained Multi-modal Pre-training
Lisai Zhang
Qingcai Chen
Zhijian Chen
Yunpeng Han
Zhonghua Li
Bo Zhao
VLM
204
1
0
09 Mar 2023
Interpretable Visual Question Answering Referring to Outside Knowledge
IEEE International Conference on Image Processing (ICIP), 2023
He Zhu
Ren Togo
Takahiro Ogawa
Miki Haseyama
168
1
0
08 Mar 2023
Graph Neural Networks in Vision-Language Image Understanding: A Survey
The Visual Computer (TVC), 2023
Henry Senior
Greg Slabaugh
Shanxin Yuan
Luca Rossi
GNN
325
37
0
07 Mar 2023
Neighborhood Contrastive Transformer for Change Captioning
IEEE transactions on multimedia (IEEE TMM), 2023
Yunbin Tu
Liang Li
Li Su
Kelvin Lu
Qin Huang
ViT
195
29
0
06 Mar 2023
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training
International Conference on Learning Representations (ICLR), 2023
Wei Li
Linchao Zhu
Longyin Wen
Yi Yang
VLM
230
124
0
06 Mar 2023
Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning
Pranav Dandwate
Chaitanya Shahane
V. Jagtap
Shridevi C. Karande
183
9
0
05 Mar 2023
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Computer Vision and Pattern Recognition (CVPR), 2023
Zequn Zeng
Hao Zhang
Zhengjue Wang
Ruiying Lu
Dongsheng Wang
Bo Chen
BDL
DiffM
244
59
0
04 Mar 2023
Language Is Not All You Need: Aligning Perception with Language Models
Neural Information Processing Systems (NeurIPS), 2023
Shaohan Huang
Li Dong
Wenhui Wang
Y. Hao
Saksham Singhal
...
Johan Bjorck
Vishrav Chaudhary
Subhojit Som
Xia Song
Furu Wei
VLM
LRM
MLLM
356
699
0
27 Feb 2023
Learning Visual Representations via Language-Guided Sampling
Computer Vision and Pattern Recognition (CVPR), 2023
Mohamed El Banani
Karan Desai
Justin Johnson
SSL
VLM
464
36
0
23 Feb 2023
Test-Time Distribution Normalization for Contrastively Learned Vision-language Models
Neural Information Processing Systems (NeurIPS), 2023
Yi Zhou
Juntao Ren
Fengyu Li
Ramin Zabih
Ser-Nam Lim
VLM
265
21
0
22 Feb 2023
Retrieval-augmented Image Captioning
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
R. Ramos
Desmond Elliott
Bruno Martins
VLM
199
44
0
16 Feb 2023
Towards Local Visual Modeling for Image Captioning
Pattern Recognition (Pattern Recogn.), 2023
Yiwei Ma
Jiayi Ji
Xiaoshuai Sun
Weihao Ye
Rongrong Ji
ViT
262
110
0
13 Feb 2023
Stacked Cross-modal Feature Consolidation Attention Networks for Image Captioning
Mozhgan Pourkeshavarz
Shahabedin Nabavi
Mohsen Moghaddam
M. Shamsfard
197
4
0
08 Feb 2023
KENGIC: KEyword-driven and N-Gram Graph based Image Captioning
International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2022
Brandon Birmingham
A. Muscat
118
1
0
07 Feb 2023
DEVICE: Depth and Visual Concepts Aware Transformer for OCR-based Image Captioning
Pattern Recognition (Pattern Recogn.), 2023
Dongsheng Xu
Qingbao Huang
Shuang Feng
Yiru Cai
Feng Shuang
Yi Cai
ViT
VLM
563
1
0
03 Feb 2023
Style-Aware Contrastive Learning for Multi-Style Image Captioning
Findings (Findings), 2023
Yucheng Zhou
Guodong Long
164
28
0
26 Jan 2023
Semi-Supervised Image Captioning by Adversarially Propagating Labeled Data
IEEE Access (IEEE Access), 2023
Dong-Jin Kim
Tae-Hyun Oh
Jinsoo Choi
In So Kweon
SSL
VLM
162
10
0
26 Jan 2023
Towards a Unified Model for Generating Answers and Explanations in Visual Question Answering
Findings (Findings), 2023
Chenxi Whitehouse
Tillman Weyde
Pranava Madhyastha
LRM
292
4
0
25 Jan 2023
Visual Semantic Relatedness Dataset for Image Captioning
Ahmed Sabir
Francesc Moreno-Noguer
Lluís Padró
CoGe
VLM
221
4
0
20 Jan 2023
Embodied Agents for Efficient Exploration and Smart Scene Description
IEEE International Conference on Robotics and Automation (ICRA), 2023
Roberto Bigazzi
Marcella Cornia
S. Cascianelli
Lorenzo Baraldi
Rita Cucchiara
LM&Ro
185
10
0
17 Jan 2023
Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review
Reza Azad
Amirhossein Kazerouni
Moein Heidari
Ehsan Khodapanah Aghdam
Amir Molaei
Yiwei Jia
Abin Jose
Rijo Roy
Dorit Merhof
MedIm
ViT
401
331
0
09 Jan 2023
Adaptively Clustering Neighbor Elements for Image-Text Generation
Zihua Wang
Xu Yang
Hanwang Zhang
Haiyang Xu
Mingshi Yan
Feisi Huang
Yu Zhang
VLM
556
0
0
05 Jan 2023
Do DALL-E and Flamingo Understand Each Other?
IEEE International Conference on Computer Vision (ICCV), 2022
Hang Li
Jindong Gu
Rajat Koner
Sahand Sharifzadeh
Volker Tresp
MLLM
231
14
0
23 Dec 2022
Benchmarking Spatial Relationships in Text-to-Image Generation
Tejas Gokhale
Hamid Palangi
Besmira Nushi
Vibhav Vineet
Eric Horvitz
Ece Kamar
Chitta Baral
Yezhou Yang
EGVM
389
88
0
20 Dec 2022
MetaCLUE: Towards Comprehensive Visual Metaphors Research
Computer Vision and Pattern Recognition (CVPR), 2022
Arjun Reddy Akula
Brenda S. Driscoll
P. Narayana
Soravit Changpinyo
Zhi-xuan Jia
...
Sugato Basu
Leonidas Guibas
William T. Freeman
Yuanzhen Li
Varun Jampani
CLIP
VLM
207
45
0
19 Dec 2022
Efficient Image Captioning for Edge Devices
AAAI Conference on Artificial Intelligence (AAAI), 2022
Ning Wang
Jiangrong Xie
Hangzai Luo
Qinglin Cheng
Jihao Wu
Mingbo Jia
Linlin Li
VLM
CLIP
224
40
0
18 Dec 2022
Harnessing the Power of Multi-Task Pretraining for Ground-Truth Level Natural Language Explanations
Björn Plüster
Jakob Ambsdorf
Lukas Braach
Jae Hee Lee
S. Wermter
226
6
0
08 Dec 2022
Switching to Discriminative Image Captioning by Relieving a Bottleneck of Reinforcement Learning
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Ukyo Honda
Taro Watanabe
Yuji Matsumoto
242
13
0
06 Dec 2022
Semantic-Conditional Diffusion Networks for Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2022
Jianjie Luo
Yehao Li
Yingwei Pan
Ting Yao
Jianlin Feng
Hongyang Chao
Tao Mei
DiffM
248
116
0
06 Dec 2022
Towards Generating Diverse Audio Captions via Adversarial Training
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2022
Xinhao Mei
Xubo Liu
Jianyuan Sun
Mark D. Plumbley
Wenwu Wang
DiffM
308
6
0
05 Dec 2022
Controllable Image Captioning via Prompting
AAAI Conference on Artificial Intelligence (AAAI), 2022
Ning Wang
Jiahao Xie
Jihao Wu
Mingbo Jia
Linlin Li
264
39
0
04 Dec 2022
Uncertainty-Aware Image Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Zhengcong Fei
Mingyuan Fan
Li Zhu
Junshi Huang
Xiaoming Wei
Xiaolin K. Wei
UQLM
232
20
0
30 Nov 2022
CLID: Controlled-Length Image Descriptions with Limited Data
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2022
Elad Hirsch
A. Tal
VLM
3DV
293
6
0
27 Nov 2022
Aesthetically Relevant Image Captioning
AAAI Conference on Artificial Intelligence (AAAI), 2022
Zhipeng Zhong
Fei Zhou
Guoping Qiu
134
16
0
25 Nov 2022