1607.08822
Cited By
SPICE: Semantic Propositional Image Caption Evaluation
29 July 2016
Peter Anderson
Basura Fernando
Mark Johnson
Stephen Gould
EGVM
Papers citing
"SPICE: Semantic Propositional Image Caption Evaluation"
50 / 1,002 papers shown
Self-Supervised Image Captioning with CLIP
Chuanyang Jin
VLM
SSL
210
3
0
26 Jun 2023
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
International Conference on Learning Representations (ICLR), 2023
Fuxiao Liu
Kevin Qinghong Lin
Linjie Li
Jianfeng Wang
Yaser Yacoob
Lijuan Wang
VLM
MLLM
454
412
0
26 Jun 2023
Improving Reference-based Distinctive Image Captioning with Contrastive Rewards
Yangjun Mao
Jun Xiao
Dong Zhang
Meng Cao
Jian Shao
Yueting Zhuang
Long Chen
EGVM
211
10
0
25 Jun 2023
An overview on the evaluated video retrieval tasks at TRECVID 2022
G. Awad
Keith Curtis
A. Butt
Jonathan G. Fiscus
A. Godil
...
Eliot Godard
Lukas L. Diduch
Jeffrey Liu
Yvette Graham
Georges Quénot
89
13
0
22 Jun 2023
SituatedGen: Incorporating Geographical and Temporal Contexts into Generative Commonsense Reasoning
Neural Information Processing Systems (NeurIPS), 2023
Yunxiang Zhang
Xiaojun Wan
AILaw
LRM
232
9
0
21 Jun 2023
Learning to Generate Better Than Your LLM
Jonathan D. Chang
Kianté Brantley
Rajkumar Ramamurthy
Dipendra Kumar Misra
Wen Sun
273
54
0
20 Jun 2023
Improving Image Captioning Descriptiveness by Ranking and LLM-based Fusion
Luigi Celona
Simone Bianco
Marco Donzella
Paolo Napoletano
268
26
0
20 Jun 2023
Improving Audio Caption Fluency with Automatic Error Correction
Hanxue Zhang
Zeyu Xie
Xuenan Xu
Mengyue Wu
K. Yu
140
0
0
16 Jun 2023
Listener Model for the PhotoBook Referential Game with CLIPScores as Implicit Reference Chain
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Shih-Lun Wu
Yi-Hui Chou
Liang Li
152
0
0
16 Jun 2023
Top-Down Framework for Weakly-supervised Grounded Image Captioning
Chen Cai
Suchen Wang
Kim-Hui Yap
Yi Wang
ObjD
235
5
0
13 Jun 2023
Embodied Executable Policy Learning with Language-based Scene Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Jielin Qiu
Mengdi Xu
William Jongwon Han
Seungwhan Moon
Ding Zhao
LM&Ro
156
9
0
09 Jun 2023
Towards Adaptable and Interactive Image Captioning with Data Augmentation and Episodic Memory
Aliki Anagnostopoulou
Mareike Hartmann
Daniel Sonntag
CLL
VLM
192
1
0
06 Jun 2023
SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning
Zhishen Yang
Mary Dabre
Hideki Tanaka
Naoaki Okazaki
346
24
0
06 Jun 2023
Enhance Temporal Relations in Audio Captioning with Sound Event Detection
Interspeech (Interspeech), 2023
Zeyu Xie
Xuenan Xu
Mengyue Wu
K. Yu
235
16
0
02 Jun 2023
Adapting a ConvNeXt model to audio classification on AudioSet
Interspeech (Interspeech), 2023
Thomas Pellegrini
Ismail Khalfaoui-Hassani
Etienne Labbé
T. Masquelier
191
29
0
01 Jun 2023
CapText: Large Language Model-based Caption Generation From Image Context and Description
Shinjini Ghosh
Sagnik Anupam
VLM
329
4
0
01 Jun 2023
DisCLIP: Open-Vocabulary Referring Expression Generation
British Machine Vision Conference (BMVC), 2023
Lior Bracha
E. Shaar
Aviv Shamsian
Ethan Fetaya
Gal Chechik
ObjD
269
9
0
30 May 2023
Dual Transformer Decoder based Features Fusion Network for Automated Audio Captioning
Interspeech (Interspeech), 2023
Jianyuan Sun
Xubo Liu
Xinhao Mei
V. Kılıç
Mark D. Plumbley
Wenwu Wang
168
5
0
30 May 2023
FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhuang Li
Yuyang Chai
Terry Yue Zhuo
Lizhen Qu
Gholamreza Haffari
Fei Li
Donghong Ji
Quan Hung Tran
328
51
0
27 May 2023
Learning to Imagine: Visually-Augmented Natural Language Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Tianyi Tang
Yushuo Chen
Yifan Du
Junyi Li
Wayne Xin Zhao
Ji-Rong Wen
DiffM
428
10
0
26 May 2023
Text-to-Motion Retrieval: Towards Joint Understanding of Human Motion Data and Natural Language
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2023
Nicola Messina
J. Sedmidubský
Fabrizio Falchi
Tomáš Rebok
EGVM
231
19
0
25 May 2023
Visual Programming for Text-to-Image Generation and Evaluation
Jaemin Cho
Abhaysinh Zala
Joey Tianyi Zhou
MLLM
390
55
0
24 May 2023
Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Tianyi Tang
Hongyuan Lu
Yuchen Eleanor Jiang
Haoyang Huang
Dongdong Zhang
Wayne Xin Zhao
Tom Kocmi
Furu Wei
165
7
0
24 May 2023
#REVAL: a semantic evaluation framework for hashtag recommendation
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2023
Areej Alsini
D. Huynh
A. Datta
92
0
0
24 May 2023
Gender Biases in Automatic Evaluation Metrics for Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Haoyi Qiu
Zi-Yi Dou
Tianlu Wang
Asli Celikyilmaz
Nanyun Peng
EGVM
416
22
0
24 May 2023
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
Shyamgopal Karthik
Karsten Roth
Goran Frehse
Zeynep Akata
237
33
0
22 May 2023
GEST: the Graph of Events in Space and Time as a Common Representation between Vision and Language
Mihai Masala
Nicolae Cudlenco
Traian Rebedea
Marius Leordeanu
193
0
0
22 May 2023
A request for clarity over the End of Sequence token in the Self-Critical Sequence Training
International Conference on Image Analysis and Processing (ICIAP), 2023
J. Hu
Roberto Cavicchioli
Alessandro Capotondi
271
7
0
20 May 2023
What Makes for Good Visual Tokenizers for Large Language Models?
Guangzhi Wang
Yixiao Ge
Xiaohan Ding
Mohan S. Kankanhalli
Ying Shan
MLLM
VLM
291
46
0
20 May 2023
DiffCap: Exploring Continuous Diffusion on Image Captioning
Yufeng He
Zefan Cai
Xu Gan
Baobao Chang
DiffM
205
11
0
20 May 2023
PASTS: Progress-Aware Spatio-Temporal Transformer Speaker For Vision-and-Language Navigation
Engineering applications of artificial intelligence (Eng. Appl. Artif. Intell.), 2023
Liuyi Wang
Chengju Liu
Zongtao He
Shu Li
Qingqing Yan
Huiyi Chen
Qi Chen
216
13
0
19 May 2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Neural Information Processing Systems (NeurIPS), 2023
Yujie Lu
Xianjun Yang
Xiujun Li
Xinze Wang
William Yang Wang
EGVM
431
99
0
18 May 2023
Listen, Think, and Understand
International Conference on Learning Representations (ICLR), 2023
Yuan Gong
Hongyin Luo
Alexander H. Liu
Leonid Karlinsky
James R. Glass
ELM
MLLM
LRM
699
221
0
18 May 2023
Foundations of Spatial Perception for Robotics: Hierarchical Representations and Real-time Systems
Nathan Hughes
Yun Chang
Siyi Hu
Rajat Talak
Rumaisa Abdulhai
Jared Strader
Luca Carlone
264
80
0
11 May 2023
Simple Token-Level Confidence Improves Caption Correctness
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Suzanne Petryk
Spencer Whitehead
Joseph E. Gonzalez
Trevor Darrell
Anna Rohrbach
Marcus Rohrbach
245
10
0
11 May 2023
InfoMetIC: An Informative Metric for Reference-free Image Caption Evaluation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Anwen Hu
Shizhe Chen
Liang Zhang
Qin Jin
237
28
0
10 May 2023
Transforming Visual Scene Graphs to Image Captions
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Xu Yang
Jiawei Peng
Zihua Wang
Haiyang Xu
Qinghao Ye
Chenliang Li
Mingshi Yan
Feisi Huang
Zhangzikang Li
Yu Zhang
372
25
0
03 May 2023
Diverse and Vivid Sound Generation from Text Descriptions
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023
Guangwei Li
Xuenan Xu
Lingfeng Dai
Mengyue Wu
K. Yu
191
5
0
03 May 2023
Visual Transformation Telling
Wanqing Cui
Mustafa Nasir-Moin
Yanyan Lan
Viola J. Chen
Jiafeng Guo
Xueqi Cheng
LRM
256
4
0
03 May 2023
Multimodal Data Augmentation for Image Captioning using Diffusion Models
Changrong Xiao
S. Xu
Kunpeng Zhang
DiffM
207
16
0
03 May 2023
Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer
European Signal Processing Conference (EUSIPCO), 2023
Etienne Labbé
J. Pinquier
Thomas Pellegrini
211
5
0
02 May 2023
VPGTrans: Transfer Visual Prompt Generator across LLMs
Neural Information Processing Systems (NeurIPS), 2023
Ao Zhang
Hao Fei
Yuan Yao
Wei Ji
Li Li
Zhiyuan Liu
Tat-Seng Chua
MLLM
VLM
211
101
0
02 May 2023
Quality-agnostic Image Captioning to Safely Assist People with Vision Impairment
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Lu Yu
Malvina Nikandrou
Jiali Jin
Verena Rieser
158
6
0
28 Apr 2023
From Association to Generation: Text-only Captioning by Unsupervised Cross-modal Mapping
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Junyan Wang
Ming Yan
Yi Zhang
Jitao Sang
CLIP
VLM
301
17
0
26 Apr 2023
A Review of Deep Learning for Video Captioning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Moloud Abdar
Meenakshi Kollati
Swaraja Kuraparthi
Farhad Pourpanah
Daniel J. McDuff
...
Shuicheng Yan
Abduallah A. Mohamed
Abbas Khosravi
Xiaoshi Zhong
Fatih Porikli
3DV
226
41
0
22 Apr 2023
VALOR: Vision-Audio-Language Omni-Perception Pretraining Model and Dataset
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Sihan Chen
Xingjian He
Longteng Guo
Xinxin Zhu
Weining Wang
Jinhui Tang
Jing Liu
VLM
401
154
0
17 Apr 2023
Tractable Control for Autoregressive Language Generation
International Conference on Machine Learning (ICML), 2023
Honghua Zhang
Meihua Dang
Nanyun Peng
Karen Ullrich
BDL
436
58
0
15 Apr 2023
A-CAP: Anticipation Captioning with Commonsense Knowledge
Computer Vision and Pattern Recognition (CVPR), 2023
D. Vo
Quoc-An Luong
Akihiro Sugimoto
Hideki Nakayama
161
3
0
13 Apr 2023
Model-Agnostic Gender Debiased Image Captioning
Computer Vision and Pattern Recognition (CVPR), 2023
Yusuke Hirota
Yuta Nakashima
Noa Garcia
FaML
339
23
0
07 Apr 2023
Graph Attention for Automated Audio Captioning
IEEE Signal Processing Letters (IEEE SPL), 2023
Feiyang Xiao
Jian Guan
Qiaoxi Zhu
Wenwu Wang
209
11
0
07 Apr 2023