2104.08718
CLIPScore: A Reference-free Evaluation Metric for Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2021
18 April 2021
Jack Hessel
Ari Holtzman
Maxwell Forbes
Ronan Le Bras
Yejin Choi
CLIP
Papers citing "CLIPScore: A Reference-free Evaluation Metric for Image Captioning"
50 / 1,489 papers shown
Evaluating the Robustness of Text-to-image Diffusion Models against Real-world Attacks
Hongcheng Gao
Hao Zhang
Yinpeng Dong
Zhijie Deng
AAML
305
25
0
16 Jun 2023
Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Compositional Understanding
Computer Vision and Pattern Recognition (CVPR), 2023
Le Zhang
Rabiul Awal
Aishwarya Agrawal
CoGe
VLM
418
27
0
15 Jun 2023
Pragmatic Inference with a CLIP Listener for Contrastive Captioning
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jiefu Ou
Benno Krojer
Daniel Fried
272
6
0
15 Jun 2023
Extending CLIP's Image-Text Alignment to Referring Image Segmentation
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Seoyeon Kim
Minguk Kang
Dongwon Kim
Jaesik Park
Suha Kwak
VLM
309
21
0
14 Jun 2023
Scalable 3D Captioning with Pretrained Models
Neural Information Processing Systems (NeurIPS), 2023
Tiange Luo
C. Rockwell
Honglak Lee
Justin Johnson
311
213
0
12 Jun 2023
Boosting GUI Prototyping with Diffusion Models
IEEE International Requirements Engineering Conference (RE), 2023
Jialiang Wei
A. Courbis
Thomas Lambolais
Binbin Xu
P. Bernard
Gérard Dray
DiffM
170
25
0
09 Jun 2023
Grounded Text-to-Image Synthesis with Attention Refocusing
Computer Vision and Pattern Recognition (CVPR), 2023
Quynh Phung
Songwei Ge
Jia-Bin Huang
DiffM
414
157
0
08 Jun 2023
SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions
Neural Information Processing Systems (NeurIPS), 2023
Yuseung Lee
Kunho Kim
Hyunjin Kim
Minhyuk Sung
DiffM
389
91
0
08 Jun 2023
WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2023
Changhoon Kim
Kyle Min
Maitreya Patel
Sheng Cheng
Yezhou Yang
WIGM
338
53
0
07 Jun 2023
AGIQA-3K: An Open Database for AI-Generated Image Quality Assessment
Chunyi Li
Zicheng Zhang
Haoning Wu
Wei Sun
Xiongkuo Min
Xiaohong Liu
Guangtao Zhai
Weisi Lin
EGVM
259
195
0
07 Jun 2023
ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models
AAAI Conference on Artificial Intelligence (AAAI), 2023
Maitreya Patel
Tejas Gokhale
Chitta Baral
Yezhou Yang
CoGe
269
16
0
07 Jun 2023
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Neural Information Processing Systems (NeurIPS), 2023
Alexandre Ramé
Guillaume Couairon
Mustafa Shukor
Corentin Dancette
Jean-Baptiste Gaya
Laure Soulier
Matthieu Cord
MoMe
367
203
0
07 Jun 2023
Multi-modal Latent Diffusion
Entropy (Entropy), 2023
Mustapha Bounoua
Giulio Franzese
Pietro Michiardi
DiffM
264
16
0
07 Jun 2023
HeadSculpt: Crafting 3D Head Avatars with Text
Neural Information Processing Systems (NeurIPS), 2023
Xiaoping Han
Yukang Cao
Kai Han
Xiatian Zhu
Jiankang Deng
Yi-Zhe Song
Tao Xiang
Kwan-Yee K. Wong
DiffM
203
61
0
05 Jun 2023
Revisiting the Role of Language Priors in Vision-Language Models
International Conference on Machine Learning (ICML), 2023
Zhiqiu Lin
Xinyue Chen
Deepak Pathak
Pengchuan Zhang
Deva Ramanan
VLM
470
38
0
02 Jun 2023
ReFACT: Updating Text-to-Image Models by Editing the Text Encoder
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Dana Arad
Hadas Orgad
Yonatan Belinkov
KELM
348
29
0
01 Jun 2023
Diffusion Brush: A Latent Diffusion Model-based Editing Tool for AI-generated Images
Peyman Gholami
R. Xiao
DiffM
257
3
0
31 May 2023
Understanding and Mitigating Copying in Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Gowthami Somepalli
Vasu Singla
Micah Goldblum
Jonas Geiping
Tom Goldstein
DiffM
273
201
0
31 May 2023
RealignDiff: Boosting Text-to-Image Diffusion Model with Coarse-to-fine Semantic Re-alignment
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
Guian Fang
Zutao Jiang
Jianhua Han
Guangsong Lu
Hang Xu
Shengcai Liao
Xiaodan Liang
EGVM
171
2
0
31 May 2023
DisCLIP: Open-Vocabulary Referring Expression Generation
British Machine Vision Conference (BMVC), 2023
Lior Bracha
E. Shaar
Aviv Shamsian
Ethan Fetaya
Gal Chechik
ObjD
261
9
0
30 May 2023
Nested Diffusion Processes for Anytime Image Generation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Noam Elata
Bahjat Kawar
T. Michaeli
Michael Elad
DiffM
262
7
0
30 May 2023
LayerDiffusion: Layered Controlled Image Editing with Diffusion Models
Pengzhi Li
Qinxuan Huang
Yikang Ding
Zhiheng Li
DiffM
214
43
0
30 May 2023
TaleCrafter: Interactive Story Visualization with Multiple Characters
ACM SIGGRAPH Conference and Exhibition on Computer Graphics and Interactive Techniques in Asia (SIGGRAPH Asia), 2023
Yuan Gong
Youxin Pang
Xiaodong Cun
Menghan Xia
Yingqing He
...
Longyue Wang
Yong Zhang
Xintao Wang
Ying Shan
Yujiu Yang
DiffM
351
65
0
29 May 2023
InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
Qian Wang
Biao Zhang
Michael Birsak
Peter Wonka
DiffM
249
56
0
29 May 2023
Test-Time Adaptation with CLIP Reward for Zero-Shot Generalization in Vision-Language Models
International Conference on Learning Representations (ICLR), 2023
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yezhou Yang
VLM
322
39
0
29 May 2023
Conditional Score Guidance for Text-Driven Image-to-Image Translation
Neural Information Processing Systems (NeurIPS), 2023
Hyunsoo Lee
Minsoo Kang
Bohyung Han
DiffM
188
19
0
29 May 2023
Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation
Jia-Bin Huang
Yi Ren
Rongjie Huang
Dongchao Yang
Zhenhui Ye
Chen Zhang
Jinglin Liu
Xiang Yin
Zejun Ma
Zhou Zhao
DiffM
221
98
0
29 May 2023
FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Noam Rotstein
David Bensaid
Shaked Brody
Roy Ganz
Ron Kimmel
VLM
393
53
0
28 May 2023
FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zhuang Li
Yuyang Chai
Terry Yue Zhuo
Lizhen Qu
Gholamreza Haffari
Fei Li
Donghong Ji
Quan Hung Tran
327
51
0
27 May 2023
Towards Consistent Video Editing with Text-to-Image Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Zicheng Zhang
Bonan Li
Xuecheng Nie
Congying Han
Tiande Guo
Luoqi Liu
DiffM
136
43
0
27 May 2023
Accelerating Text-to-Image Editing via Cache-Enabled Sparse Diffusion Inference
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zihao Yu
Haoyang Li
Fangcheng Fu
Xupeng Miao
Tengjiao Wang
DiffM
312
16
0
27 May 2023
MPCHAT: Towards Multimodal Persona-Grounded Conversation
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Jaewoo Ahn
Yeda Song
Sangdoo Yun
Gunhee Kim
180
26
0
27 May 2023
S4M: Generating Radiology Reports by A Single Model for Multiple Body Parts
Asian Conference on Computer Vision (ACCV), 2023
Qi Chen
Yutong Xie
Biao Wu
Minh-Son To
James Ang
Qi Wu
155
2
0
26 May 2023
Are Diffusion Models Vision-And-Language Reasoners?
Neural Information Processing Systems (NeurIPS), 2023
Benno Krojer
Elinor Poole-Dayan
Vikram S. Voleti
Christopher Pal
Siva Reddy
500
17
0
25 May 2023
Parallel Sampling of Diffusion Models
Neural Information Processing Systems (NeurIPS), 2023
Andy Shih
Suneel Belkhale
Stefano Ermon
Dorsa Sadigh
Nima Anari
DiffM
438
100
0
25 May 2023
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
European Conference on Computer Vision (ECCV), 2023
Ibrahim Ethem Hamamci
Sezgin Er
Anjany Sekuboyina
Enis Simsar
A. Tezcan
...
Hadrien Reynaud
Sarthak Pati
Christian Bluethgen
M. K. Özdemir
Bjoern Menze
DiffM
MedIm
392
52
0
25 May 2023
Weakly Supervised Vision-and-Language Pre-training with Relative Representations
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Chi Chen
Peng Li
Maosong Sun
Yang Liu
152
2
0
24 May 2023
MultiFusion: Fusing Pre-Trained Models for Multi-Lingual, Multi-Modal Image Generation
Neural Information Processing Systems (NeurIPS), 2023
Marco Bellagente
Manuel Brack
H. Teufel
Felix Friedrich
Bjorn Deiseroth
...
Koen Oostermeijer
Andres Felipe Cruz Salinas
P. Schramowski
Kristian Kersting
Samuel Weinbach
376
28
0
24 May 2023
Not All Metrics Are Guilty: Improving NLG Evaluation by Diversifying References
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Tianyi Tang
Hongyuan Lu
Yuchen Eleanor Jiang
Haoyang Huang
Dongdong Zhang
Wayne Xin Zhao
Tom Kocmi
Furu Wei
162
7
0
24 May 2023
Transferring Visual Attributes from Natural Language to Verified Image Generation
Rodrigo Valerio
João Bordalo
Michal Yarom
Yonatan Bitton
Idan Szpektor
João Magalhães
194
5
0
24 May 2023
An Examination of the Robustness of Reference-Free Image Captioning Evaluation Metrics
Findings (Findings), 2023
Saba Ahmadi
Aishwarya Agrawal
232
14
0
24 May 2023
Gender Biases in Automatic Evaluation Metrics for Image Captioning
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Haoyi Qiu
Zi-Yi Dou
Tianlu Wang
Asli Celikyilmaz
Nanyun Peng
EGVM
413
22
0
24 May 2023
Text-guided 3D Human Generation from 2D Collections
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Tsu-Jui Fu
Wenhan Xiong
Yixin Nie
Jingyu Liu
Barlas Oğuz
William Yang Wang
175
3
0
23 May 2023
CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model
IEEE Transactions on Image Processing (IEEE TIP), 2023
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yezhou Yang
CLIP
VLM
381
45
0
23 May 2023
If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection
Shyamgopal Karthik
Karsten Roth
Goran Frehse
Zeynep Akata
236
33
0
22 May 2023
The CLIP Model is Secretly an Image-to-Prompt Converter
Neural Information Processing Systems (NeurIPS), 2023
Yuxuan Ding
Chunna Tian
Haoxuan Ding
Lingqiao Liu
DiffM
150
17
0
22 May 2023
Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model
Jie Yang
Bing Li
Fengyu Yang
Ailing Zeng
Lei Zhang
Ruimao Zhang
VLM
DiffM
275
31
0
20 May 2023
Movie101: A New Movie Understanding Benchmark
Annual Meeting of the Association for Computational Linguistics (ACL), 2023
Zihao Yue
Tao Gui
Anwen Hu
Liang Zhang
Ziheng Wang
Qin Jin
VGen
278
25
0
20 May 2023
LaCon: Late-Constraint Diffusion for Steerable Guided Image Synthesis
Yu Xie
Rui Li
Kaidong Zhang
Xin Luo
Dong Liu
DiffM
479
5
0
19 May 2023
LLMScore: Unveiling the Power of Large Language Models in Text-to-Image Synthesis Evaluation
Neural Information Processing Systems (NeurIPS), 2023
Yujie Lu
Xianjun Yang
Xiujun Li
Xinze Wang
William Yang Wang
EGVM
431
99
0
18 May 2023