2304.05303
Cited By
ELVIS: Empowering Locality of Vision Language Pre-training with Intra-modal Similarity
11 April 2023
Sumin Seo
Jaewoong Shin
Jaewoo Kang
Tae Soo Kim
Thijs Kooi
Papers citing "ELVIS: Empowering Locality of Vision Language Pre-training with Intra-modal Similarity"
4 / 4 papers shown
Title
UniCLIP: Unified Framework for Contrastive Language-Image Pre-training
Janghyeon Lee
Jongsuk Kim
Hyounguk Shon
Bumsoo Kim
Seung Wook Kim
Honglak Lee
Junmo Kim
CLIP
VLM
50
51
0
27 Sep 2022
Joint Learning of Localized Representations from Medical Images and Reports
Philip Müller
Georgios Kaissis
Congyu Zou
Daniel Rueckert
132
79
0
06 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision
Chao Jia
Yinfei Yang
Ye Xia
Yi-Ting Chen
Zarana Parekh
Hieu H. Pham
Quoc V. Le
Yun-hsuan Sung
Zhen Li
Tom Duerig
VLM
CLIP
293
3,683
0
11 Feb 2021