
CLIPScore: A Reference-free Evaluation Metric for Image Captioning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021

18 April 2021

Yejin Choi

Papers citing "CLIPScore: A Reference-free Evaluation Metric for Image Captioning"

50 / 1,489 papers shown

Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks

Hongcheng Gao

Hao Zhang

Yinpeng Dong

Zhijie Deng

AAML

305

16 Jun 2023

Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding

Computer Vision and Pattern Recognition (CVPR), 2023

418

15 Jun 2023

Pragmatic Inference with a CLIP Listener for Contrastive Captioning

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Jiefu Ou

Benno Krojer

Daniel Fried

272

15 Jun 2023

Extending CLIP's Image-Text Alignment to Referring Image Segmentation

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

309

14 Jun 2023

Scalable 3D Captioning with Pretrained Models

Neural Information Processing Systems (NeurIPS), 2023

311

213

12 Jun 2023

Boosting GUI Prototyping with Diffusion Models

IEEE International Requirements Engineering Conference (RE), 2023

170

09 Jun 2023

Grounded Text-to-Image Synthesis with Attention Refocusing

Computer Vision and Pattern Recognition (CVPR), 2023

414

157

08 Jun 2023

SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions

Neural Information Processing Systems (NeurIPS), 2023

389

08 Jun 2023

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2023

Yezhou Yang

338

07 Jun 2023

AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment

Chunyi Li

Zicheng Zhang

Haoning Wu

Wei Sun

Xiongkuo Min

Xiaohong Liu

Guangtao Zhai

Weisi Lin

EGVM

259

195

07 Jun 2023

ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models

AAAI Conference on Artificial Intelligence (AAAI), 2023

Yezhou Yang

269

07 Jun 2023

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

Neural Information Processing Systems (NeurIPS), 2023

367

203

07 Jun 2023

Multi-modal Latent Diffusion

Entropy (Entropy), 2023

264

07 Jun 2023

HeadSculpt: Crafting 3D Head Avatars with Text

Neural Information Processing Systems (NeurIPS), 2023

203

05 Jun 2023

Revisiting the Role of Language Priors in Vision-Language Models

International Conference on Machine Learning (ICML), 2023

470

02 Jun 2023

ReFACT: Updating Text-to-Image Models by Editing the Text Encoder

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

348

01 Jun 2023

Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images

Peyman Gholami

R. Xiao

DiffM

257

31 May 2023

Understanding and Mitigating Copying in Diffusion Models

Neural Information Processing Systems (NeurIPS), 2023

273

201

31 May 2023

RealignDiff: Boosting Text-to-Image Diffusion Model with Coarse-to-fine Semantic Re-alignment

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023

Guian Fang

Zutao Jiang

Jianhua Han

Guangsong Lu

Hang Xu

Shengcai Liao

Xiaodan Liang

EGVM

171

31 May 2023

DisCLIP: Open-Vocabulary Referring Expression Generation

British Machine Vision Conference (BMVC), 2023

261

30 May 2023

Nested Diffusion Processes for Anytime Image Generation

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

262

30 May 2023

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

214

30 May 2023

TaleCrafter: Interactive Story Visualization with Multiple Characters

ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2023

Xiaodong Cun

...

Yong Zhang

Ying Shan

Yujiu Yang

DiffM

351

29 May 2023

InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions

249

29 May 2023

Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models

International Conference on Learning Representations (ICLR), 2023

322

29 May 2023

Conditional Score Guidance for Text-Driven Image-to-Image Translation

Neural Information Processing Systems (NeurIPS), 2023

188

29 May 2023

Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

Jia-Bin Huang

Yi Ren

Rongjie Huang

Dongchao Yang

Xiang Yin

Zhou Zhao

221

29 May 2023

FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

393

28 May 2023

FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

327

27 May 2023

Towards Consistent Video Editing with Text-to-Image Diffusion Models

Neural Information Processing Systems (NeurIPS), 2023

Zicheng Zhang

Bonan Li

Xuecheng Nie

Congying Han

Tiande Guo

Luoqi Liu

DiffM

136

27 May 2023

Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference

AAAI Conference on Artificial Intelligence (AAAI), 2023

312

27 May 2023

MPCHAT: Towards Multimodal Persona-Grounded Conversation

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

180

27 May 2023

S4M: Generating Radiology Reports by A Single Model for Multiple Body Parts

Asian Conference on Computer Vision (ACCV), 2023

Qi Wu

155

26 May 2023

Are Diffusion Models Vision-And-Language Reasoners?

Neural Information Processing Systems (NeurIPS), 2023

Siva Reddy

500

25 May 2023

Parallel Sampling of Diffusion Models

Neural Information Processing Systems (NeurIPS), 2023

Dorsa Sadigh

438

100

25 May 2023

GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes

European Conference on Computer Vision (ECCV), 2023

Ibrahim Ethem Hamamci

...

392

25 May 2023

Weakly Supervised Vision-and-Language Pre-training with Relative Representations

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Chi Chen

Peng Li

Maosong Sun

Yang Liu

152

24 May 2023

MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation

Neural Information Processing Systems (NeurIPS), 2023

...

Andres Felipe Cruz Salinas

P. Schramowski

Kristian Kersting

Samuel Weinbach

376

24 May 2023

Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References

North American Chapter of the Association for Computational Linguistics (NAACL), 2023

162

24 May 2023

Transferring Visual Attributes from Natural Language to Verified Image Generation

194

24 May 2023

An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics

Findings (Findings), 2023

Saba Ahmadi

Aishwarya Agrawal

232

24 May 2023

Gender Biases in Automatic Evaluation Metrics for Image Captioning

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

413

24 May 2023

Text-guided 3D Human Generation from 2D Collections

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Jingyu Liu

175

23 May 2023

CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model

IEEE Transactions on Image Processing (IEEE TIP), 2023

381

23 May 2023

If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection

236

22 May 2023

The CLIP Model is Secretly an Image-to-Prompt Converter

Neural Information Processing Systems (NeurIPS), 2023

Yuxuan Ding

Chunna Tian

Haoxuan Ding

Lingqiao Liu

DiffM

150

22 May 2023

Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model

Jie Yang

Bing Li

Fengyu Yang

Ailing Zeng

Lei Zhang

Ruimao Zhang

VLM DiffM

275

20 May 2023

Movie101: A New Movie Understanding Benchmark

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Qin Jin

278

20 May 2023

LaCon: Late-Constraint Diffusion for Steerable Guided Image Synthesis

479

19 May 2023

LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation

Neural Information Processing Systems (NeurIPS), 2023

431

18 May 2023