CLIPScore: A Reference-free Evaluation Metric for Image Captioning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

18 April 2021

Yejin Choi

Papers citing "CLIPScore: A Reference-free Evaluation Metric for Image Captioning"

39 / 1,489 papers shown

Linearly Mapping from Image to Text Space

International Conference on Learning Representations (ICLR), 2022

1.2K

145

30 Sep 2022

GAMA: Generative Adversarial Multi-Object Scene Attacks

Neural Information Processing Systems (NeurIPS), 2022

Amit K. Roy-Chowdhury

AAML

307

20 Sep 2022

Learning Distinct and Representative Styles for Image Captioning

Neural Information Processing Systems (NeurIPS), 2022

Qi Chen

Chaorui Deng

Qi Wu

VLM

187

17 Sep 2022

Distribution Aware Metrics for Conditional Natural Language Generation

International Conference on Language Resources and Evaluation (LREC), 2022

David M. Chan

Yiming Ni

David A. Ross

Sudheendra Vijayanarasimhan

Austin Myers

John F. Canny

363

15 Sep 2022

Every picture tells a story: Image-grounded controllable stylistic story generation

218

04 Sep 2022

Frido: Feature Pyramid Diffusion for Complex Scene Image Synthesis

AAAI Conference on Artificial Intelligence (AAAI), 2022

Lu Yuan

252

114

29 Aug 2022

Deepfake: Definitions, Performance Metrics and Standards, Datasets and Benchmarks, and a Meta-Review

Enes Altuncu

V. N. Franqueira

Shujun Li

344

21 Aug 2022

ARMANI: Part-level Garment-Text Alignment for Unified Cross-Modal Fashion Design

ACM Multimedia (ACM MM), 2022

Xujie Zhang

Yuyang Sha

Michael C. Kampffmeyer

Xiaodan Liang

184

11 Aug 2022

A Sketch Is Worth a Thousand Words: Image Retrieval with Text and Sketch

European Conference on Computer Vision (ECCV), 2022

Diyi Yang

190

05 Aug 2022

Exploring CLIP for Assessing the Look and Feel of Images

AAAI Conference on Artificial Intelligence (AAAI), 2022

430

977

25 Jul 2022

Zero-Shot Video Captioning with Evolving Pseudo-Tokens

Lior Wolf

233

22 Jul 2022

GRIT: Faster and Better Image captioning Transformer Using Dual Visual Features

European Conference on Computer Vision (ECCV), 2022

218

148

20 Jul 2022

Are metrics measuring what they should? An evaluation of image captioning task metrics

Signal Processing: Image Communication (SPIC), 2022

Othón González-Chávez

Guillermo Ruiz

Daniela Moctezuma

Tania A. Ramirez-delreal

226

04 Jul 2022

An Empirical Survey on Long Document Summarization: Datasets, Models and Metrics

ACM Computing Surveys (ACM CSUR), 2022

265

158

03 Jul 2022

Personalized Showcases: Generating Multi-Modal Explanations for Recommendations

Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2022

265

30 Jun 2022

Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation

International Conference on Learning Representations (ICLR), 2022

Yan Yan

383

15 Jun 2022

Fine-grained Image Captioning with CLIP Reward

391

26 May 2022

Mutual Information Divergence: A Unified Metric for Multimodal Generative Models

Neural Information Processing Systems (NeurIPS), 2022

348

25 May 2022

Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

Neural Information Processing Systems (NeurIPS), 2022

...

Raphael Gontijo-Lopes

David J Fleet

1.2K

7,527

23 May 2022

Context Matters for Image Descriptions for Accessibility: Challenges for Referenceless Evaluation Metrics

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Meredith Ringel Morris

Christopher Potts

246

21 May 2022

RankGen: Improving Text Generation with Large Ranking Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

335

19 May 2022

Language Models Can See: Plugging Visual Controls in Text Generation

Lingpeng Kong

274

111

05 May 2022

QRelScore: Better Evaluating Generated Questions with Deeper Understanding of Context-aware Relevance

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

Xiaoqiang Wang

Bang Liu

Siliang Tang

Lingfei Wu

205

29 Apr 2022

Repro: An Open-Source Library for Improving the Reproducibility and Usability of Publicly Available Research Code

Daniel Deutsch

Dan Roth

AI4CE

245

29 Apr 2022

Video Captioning: a comparative review of where we are and which could be the route

Computer Vision and Image Understanding (CVIU), 2022

Daniela Moctezuma

Tania A. Ramirez-delreal

Guillermo Ruiz

Othón González-Chávez

218

12 Apr 2022

How does fake news use a thumbnail? CLIP-based Multimodal Detection on the Unrepresentative News Image

157

12 Apr 2022

DT2I: Dense Text-to-Image Generation from Region Descriptions

International Conference on Artificial Neural Networks (ICANN), 2022

172

05 Apr 2022

Multi-Modal Knowledge Graph Construction and Application: A Survey

IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022

Zhixu Li

211

237

11 Feb 2022

Injecting Semantic Concepts into End-to-End Image Captioning

Xiaowei Hu

Yezhou Yang

Zicheng Liu

ViT VLM

243

122

09 Dec 2021

Bidimensional Leaderboards: Generate and Evaluate Language Hand in Hand

Keisuke Sakaguchi

Jacob Morrison

Yejin Choi

239

08 Dec 2021

Extract Free Dense Labels from CLIP

603

651

02 Dec 2021

ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic

Computer Vision and Pattern Recognition (CVPR), 2021

Lior Wolf

335

236

29 Nov 2021

Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets

Marcella Cornia

Lorenzo Baraldi

G. Fiameni

Rita Cucchiara

321

24 Nov 2021

Transparent Human Evaluation for Image Captioning

Keisuke Sakaguchi

Jacob Morrison

Yejin Choi

188

17 Nov 2021

EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching

Bing Li

262

17 Nov 2021

Unifying Multimodal Transformer for Bi-directional Image and Text Generation

218

19 Oct 2021

From Show to Tell: A Survey on Deep Learning-based Image Captioning

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021

Lorenzo Baraldi

437

348

14 Jul 2021

ImaginE: An Imagination-Based Automatic Evaluation Metric for Natural Language Generation

Findings (Findings), 2021

147

10 Jun 2021

Concadia: Towards Image-Based Text Generation with a Purpose

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

267

16 Apr 2021