SPICE: Semantic Propositional Image Caption Evaluation

29 July 2016

Papers citing "SPICE: Semantic Propositional Image Caption Evaluation"

50 / 1,002 papers shown

JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models

Yuiga Wada

Kanta Kaneda

Komei Sugiura

236

07 Nov 2023

Multitask Multimodal Prompted Training for Interactive Embodied Task Completion

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Georgios Pantazopoulos

Malvina Nikandrou

199

07 Nov 2023

LLM4Drive: A Survey of Large Language Models for Autonomous Driving

604

171

02 Nov 2023

CapsFusion: Rethinking Image-Text Data at Scale

Computer Vision and Pattern Recognition (CVPR), 2023

371

31 Oct 2023

Video-Helpful Multimodal Machine Translation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Sadao Kurohashi

175

31 Oct 2023

Generating Context-Aware Natural Answers for Questions in 3D Scenes

British Machine Vision Conference (BMVC), 2023

Mohammed Munzer Dwedari

Matthias Niessner

Dave Zhenyu Chen

203

30 Oct 2023

Are NLP Models Good at Tracing Thoughts: An Overview of Narrative Understanding

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Lixing Zhu

Runcong Zhao

Lin Gui

Yulan He

251

28 Oct 2023

An Early Evaluation of GPT-4V(ision)

193

25 Oct 2023

Evaluating, Understanding, and Improving Constrained Text Generation for Large Language Models

Xiang Chen

Xiaojun Wan

184

25 Oct 2023

Recent Advances in Multi-modal 3D Scene Understanding: A Comprehensive Survey and Evaluation

Peng Wang

276

24 Oct 2023

CLAIR: Evaluating Image Captions with Large Language Models

198

19 Oct 2023

Evaluating the Fairness of Discriminative Foundation Models in Computer Vision

AAAI/ACM Conference on AI, Ethics, and Society (AIES), 2023

Junaid Ali

Matthäus Kleindessner

248

18 Oct 2023

Bounding and Filling: A Fast and Flexible Framework for Image Captioning

187

15 Oct 2023

Analyzing and Mitigating Object Hallucination in Large Vision-Language Models

International Conference on Learning Representations (ICLR), 2023

Mohit Bansal

369

268

01 Oct 2023

Self-supervised Cross-view Representation Reconstruction for Change Captioning

IEEE International Conference on Computer Vision (ICCV), 2023

Yunbin Tu

195

28 Sep 2023

Targeted Image Data Augmentation Increases Basic Skills Captioning Robustness

IEEE Games Entertainment Media Conference (IEEE GEM), 2023

Valentin Barriere

Felipe del Rio

Andres Carvallo De Ferari

Carlos Aspillaga

Eugenio Herrera-Berg

Cristian Buc Calderon

DiffM

233

27 Sep 2023

MindGPT: Interpreting What You See with Non-invasive Brain Recordings

IEEE Transactions on Image Processing (IEEE TIP), 2023

Jiaxuan Chen

Yu Qi

Yueming Wang

Gang Pan

267

27 Sep 2023

Weakly-supervised Automated Audio Captioning via text only training

Theodoros Kouzelis

Vassilis Katsouros

CLIP

235

21 Sep 2023

ContextRef: Evaluating Referenceless Metrics For Image Description Generation

International Conference on Learning Representations (ICLR), 2023

246

21 Sep 2023

Toward Unified Controllable Text Generation via Regular Expression Instruction

International Joint Conference on Natural Language Processing (IJCNLP), 2023

Xin Zheng

Hongyu Lin

Xianpei Han

Le Sun

223

19 Sep 2023

Predicate Classification Using Optimal Transport Loss in Scene Graph Generation

Sorachi Kurita

Satoshi Oyama

Itsuki Noda

170

19 Sep 2023

Synth-AC: Enhancing Audio Captioning with Synthetic Supervision

177

18 Sep 2023

Viewpoint Integration and Registration with Vision Language Foundation Model for Image Change Understanding

Fan Wang

152

15 Sep 2023

Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

Jeong Hun Yeo

203

15 Sep 2023

Learning to Predict Concept Ordering for Common Sense Generation

International Joint Conference on Natural Language Processing (IJCNLP), 2023

127

12 Sep 2023

Prefix-diffusion: A Lightweight Diffusion Model for Diverse Image Captioning

International Conference on Language Resources and Evaluation (LREC), 2023

Zhengcong Fei

262

10 Sep 2023

S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning

Computer Vision and Pattern Recognition (CVPR), 2023

Qi Wu

204

05 Sep 2023

NICE: CVPR 2023 Challenge on Zero-shot Image Captioning

...

299

05 Sep 2023

DeViL: Decoding Vision features into Language

196

04 Sep 2023

CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding

Etienne Labbé

Thomas Pellegrini

J. Pinquier

288

01 Sep 2023

Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

Joshua Forster Feinglass

Yezhou Yang

177

01 Sep 2023

Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval?

Etienne Labbé

Thomas Pellegrini

J. Pinquier

169

29 Aug 2023

Explaining Vision and Language through Graphs of Events in Space and Time

188

29 Aug 2023

MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Yaowei Wang

227

25 Aug 2023

With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning

IEEE International Conference on Computer Vision (ICCV), 2023

Lorenzo Baraldi

186

23 Aug 2023

CgT-GAN: CLIP-guided Text GAN for Image Captioning

ACM Multimedia (ACM MM), 2023

229

23 Aug 2023

Audio Difference Captioning Utilizing Similarity-Discrepancy Disentanglement

221

23 Aug 2023

Explore and Tell: Embodied Visual Captioning in 3D Environments

IEEE International Conference on Computer Vision (ICCV), 2023

Qin Jin

199

21 Aug 2023

Uni-NLX: Unifying Textual Explanations for Vision and Vision-Language Tasks

Fawaz Sammani

Nikos Deligiannis

187

17 Aug 2023

Informative Scene Graph Generation via Debiasing

International Journal of Computer Vision (IJCV), 2023

Lianli Gao

Jingkuan Song

199

10 Aug 2023

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World

International Conference on Learning Representations (ICLR), 2023

...

Zhiguo Cao

Yu Qiao

270

118

03 Aug 2023

Transferable Decoding with Visual Entities for Zero-Shot Image Captioning

IEEE International Conference on Computer Vision (ICCV), 2023

Chengjie Wang

169

31 Jul 2023

Exploring Annotation-free Image Captioning with Retrieval-augmented Pseudo Sentence Generation

ACM Multimedia Asia (MA), 2023

Zhiyuan Li

Dongnan Liu

Heng Wang

Chaoyi Zhang

Weidong (Tom) Cai

RALM

190

27 Jul 2023

Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models

IEEE International Conference on Computer Vision (ICCV), 2023

273

121

26 Jul 2023

Kefa: A Knowledge Enhanced and Fine-grained Aligned Speaker for Navigation Instruction Generation

268

25 Jul 2023

Improving Multimodal Datasets with Image Captioning

Neural Information Processing Systems (NeurIPS), 2023

Thao Nguyen

263

125

19 Jul 2023

Open Scene Understanding: Grounded Situation Recognition Meets Segment Anything for Helping People with Visual Impairments

Kailun Yang

142

15 Jul 2023

Linear Alignment of Vision-language Models for Image Captioning

486

10 Jul 2023

Transformers in Healthcare: A Survey

...

267

30 Jun 2023

ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles

Natural Language Processing and Chinese Computing (NLPCC), 2023

Haoqin Tu

Bowen Yang

Xianfeng Zhao

173

29 Jun 2023