2103.00020
Cited By
Learning Transferable Visual Models From Natural Language Supervision
26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
Papers citing
"Learning Transferable Visual Models From Natural Language Supervision"
50 / 8,849 papers shown
Title
Semantic-Enhanced Image Clustering
Shao-Qian Cai
Li-qing Qiu
Xiaojun Chen
Qin Zhang
Long Chen
VLM
14
13
0
21 Aug 2022
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval
Haoran Wang
Dongliang He
Wenhao Wu
Boyang Xia
Min Yang
Fu Li
Yunlong Yu
Zhong Ji
Errui Ding
Jingdong Wang
19
22
0
21 Aug 2022
VLMAE: Vision-Language Masked Autoencoder
Su He
Taian Guo
Tao Dai
Ruizhi Qiao
Chen Wu
Xiujun Shu
Bohan Ren
VLM
19
11
0
19 Aug 2022
Demystifying Randomly Initialized Networks for Evaluating Generative Models
Junghyuk Lee
Jun-Hyuk Kim
Jong-Seok Lee
EGVM
22
0
0
19 Aug 2022
Discovering Bugs in Vision Models using Off-the-shelf Image Generation and Captioning
Olivia Wiles
Isabela Albuquerque
Sven Gowal
VLM
30
46
0
18 Aug 2022
Mere Contrastive Learning for Cross-Domain Sentiment Analysis
Yun Luo
Fang Guo
Zihan Liu
Yue Zhang
22
15
0
18 Aug 2022
Enhancing Diffusion-Based Image Synthesis with Robust Classifier Guidance
Bahjat Kawar
Roy Ganz
Michael Elad
DiffM
19
38
0
18 Aug 2022
See Finer, See More: Implicit Modality Alignment for Text-based Person Retrieval
Xiujun Shu
Wei Wen
Haoqian Wu
Keyun Chen
Yi-Zhe Song
Ruizhi Qiao
Bohan Ren
Xiao Wang
16
89
0
18 Aug 2022
Text-to-Image Generation via Implicit Visual Guidance and Hypernetwork
Xin Yuan
Zhe-nan Lin
Jason Kuen
Jianming Zhang
John Collomosse
27
5
0
17 Aug 2022
Transformer Vs. MLP-Mixer: Exponential Expressive Gap For NLP Problems
D. Navon
A. Bronstein
MoE
22
0
0
17 Aug 2022
Towards Open-vocabulary Scene Graph Generation with Prompt-based Finetuning
Tao He
Lianli Gao
Jingkuan Song
Yuan-Fang Li
VLM
18
50
0
17 Aug 2022
Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides
Dong Won Lee
Chaitanya Ahuja
Paul Pu Liang
Sanika Natu
Louis-Philippe Morency
13
7
0
17 Aug 2022
The LAM Dataset: A Novel Benchmark for Line-Level Handwritten Text Recognition
S. Cascianelli
Vittorio Pippi
Martin Maarand
Marcella Cornia
Lorenzo Baraldi
Christopher Kermorvant
Rita Cucchiara
19
7
0
16 Aug 2022
M2HF: Multi-level Multi-modal Hybrid Fusion for Text-Video Retrieval
Shuo Liu
Weize Quan
Mingyuan Zhou
Sihong Chen
Jian Kang
Zhenlan Zhao
Chen Chen
Dong-Ming Yan
8
0
0
16 Aug 2022
Gradient Mask: Lateral Inhibition Mechanism Improves Performance in Artificial Neural Networks
Lei Jiang
Yongqing Liu
Shihai Xiao
Yansong Chua
21
0
0
14 Aug 2022
Domain-invariant Prototypes for Semantic Segmentation
Zhengeng Yang
Hongshan Yu
Wei Sun
Li Cheng
Ajmal Saeed Mian
30
2
0
12 Aug 2022
Quality Not Quantity: On the Interaction between Dataset Design and Robustness of CLIP
Thao Nguyen
Gabriel Ilharco
Mitchell Wortsman
Sewoong Oh
Ludwig Schmidt
CLIP
VLM
32
97
0
10 Aug 2022
HyperNST: Hyper-Networks for Neural Style Transfer
Dan Ruta
Andrew Gilbert
Saeid Motiian
Baldo Faieta
Zhe-nan Lin
John Collomosse
25
7
0
09 Aug 2022
Aesthetic Attributes Assessment of Images with AMANv2 and DPC-CaptionsV2
Xinghui Zhou
Xin Jin
Jianwen Lv
Heng Huang
Ming Mao
Shuai Cui
CoGe
16
0
0
09 Aug 2022
Creative Wand: A System to Study Effects of Communications in Co-Creative Settings
Zhiyu Lin
Rohan Agarwal
Mark O. Riedl
17
7
0
04 Aug 2022
Expanding Language-Image Pretrained Models for General Video Recognition
Bolin Ni
Houwen Peng
Minghao Chen
Songyang Zhang
Gaofeng Meng
Jianlong Fu
Shiming Xiang
Haibin Ling
VLM
CLIP
ViT
18
312
0
04 Aug 2022
MOVE: Effective and Harmless Ownership Verification via Embedded External Features
Yiming Li
Linghui Zhu
Xiaojun Jia
Yang Bai
Yong Jiang
Shutao Xia
Xiaochun Cao
Kui Ren
AAML
25
12
0
04 Aug 2022
Masked Vision and Language Modeling for Multi-modal Representation Learning
Gukyeong Kwon
Zhaowei Cai
Avinash Ravichandran
Erhan Bas
Rahul Bhotika
Stefano Soatto
22
67
0
03 Aug 2022
Integrating Object-aware and Interaction-aware Knowledge for Weakly Supervised Scene Graph Generation
Xingchen Li
Long Chen
Wenbo Ma
Yi Yang
Jun Xiao
11
26
0
03 Aug 2022
Prompt-to-Prompt Image Editing with Cross Attention Control
Amir Hertz
Ron Mokady
J. Tenenbaum
Kfir Aberman
Yael Pritch
Daniel Cohen-Or
DiffM
20
1,689
0
02 Aug 2022
The Curse of Low Task Diversity: On the Failure of Transfer Learning to Outperform MAML and Their Empirical Equivalence
Brando Miranda
P. Yu
Yu-xiong Wang
Oluwasanmi Koyejo
21
10
0
02 Aug 2022
Augmenting Vision Language Pretraining by Learning Codebook with Visual Semantics
Xiaoyuan Guo
Jiali Duan
C.-C. Jay Kuo
J. Gichoya
Imon Banerjee
VLM
14
1
0
31 Jul 2022
Cross-Modal Alignment Learning of Vision-Language Conceptual Systems
Taehyeong Kim
H. Song
Byoung-Tak Zhang
17
4
0
31 Jul 2022
A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond
Chaoning Zhang
Chenshuang Zhang
Junha Song
John Seon Keun Yi
Kang Zhang
In So Kweon
SSL
42
70
0
30 Jul 2022
ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval
Nicola Messina
Matteo Stefanini
Marcella Cornia
Lorenzo Baraldi
Fabrizio Falchi
Giuseppe Amato
Rita Cucchiara
VLM
11
21
0
29 Jul 2022
Curriculum Learning for Data-Efficient Vision-Language Alignment
Tejas Srinivasan
Xiang Ren
Jesse Thomason
VLM
16
7
0
29 Jul 2022
Rewriting Geometric Rules of a GAN
Sheng-Yu Wang
David Bau
Jun-Yan Zhu
25
35
0
28 Jul 2022
Safety-Enhanced Autonomous Driving Using Interpretable Sensor Fusion Transformer
Hao Shao
Letian Wang
Ruobing Chen
Hongsheng Li
Y. Liu
28
195
0
28 Jul 2022
NNSmith: Generating Diverse and Valid Test Cases for Deep Learning Compilers
Jiawei Liu
Jinkun Lin
Fabian Ruffy
Cheng Tan
Jinyang Li
Aurojit Panda
Lingming Zhang
55
57
0
26 Jul 2022
Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models
Robin Rombach
A. Blattmann
Bjorn Ommer
DiffM
14
70
0
26 Jul 2022
V²L: Leveraging Vision and Vision-language Models into Large-scale Product Retrieval
Wenhao Wang
Yifan Sun
Zongxin Yang
Yi Yang
VLM
16
3
0
26 Jul 2022
S-Prompts Learning with Pre-trained Transformers: An Occam's Razor for Domain Incremental Learning
Yabin Wang
Zhiwu Huang
Xiaopeng Hong
CLL
VLM
15
208
0
26 Jul 2022
Equivariant and Invariant Grounding for Video Question Answering
Yicong Li
Xiang Wang
Junbin Xiao
Tat-Seng Chua
14
25
0
26 Jul 2022
Learning Visual Representation from Modality-Shared Contrastive Language-Image Pre-training
Haoxuan You
Luowei Zhou
Bin Xiao
Noel Codella
Yu Cheng
Ruochen Xu
Shih-Fu Chang
Lu Yuan
CLIP
VLM
19
47
0
26 Jul 2022
ArtFID: Quantitative Evaluation of Neural Style Transfer
Matthias Wright
Bjorn Ommer
EGVM
27
39
0
25 Jul 2022
Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models
Huy Ha
Shuran Song
LM&Ro
VLM
28
101
0
23 Jul 2022
Contrastive Self-Supervised Learning Leads to Higher Adversarial Susceptibility
Rohit Gupta
Naveed Akhtar
Ajmal Saeed Mian
M. Shah
AAML
SSL
21
5
0
22 Jul 2022
Improving Privacy-Preserving Vertical Federated Learning by Efficient Communication with ADMM
Chulin Xie
Pin-Yu Chen
Qinbin Li
Arash Nourian
Ce Zhang
Bo Li
FedML
23
16
0
20 Jul 2022
Tackling Long-Tailed Category Distribution Under Domain Shifts
Xiao Gu
Yao Guo
Zeju Li
Jianing Qiu
Qianming Dou
Yuxuan Liu
Benny P. L. Lo
Guangxu Yang
OOD
35
12
0
20 Jul 2022
World Robot Challenge 2020 -- Partner Robot: A Data-Driven Approach for Room Tidying with Mobile Manipulator
T. Matsushima
Yukiyasu Noguchi
Jumpei Arima
Toshiki Aoki
Yuki Okita
...
Yuki Yamashita
Shoichi Seto
S. Gu
Yusuke Iwasawa
Yutaka Matsuo
25
8
0
20 Jul 2022
NUWA-Infinity: Autoregressive over Autoregressive Generation for Infinite Visual Synthesis
Chenfei Wu
Jian Liang
Xiaowei Hu
Zhe Gan
Jianfeng Wang
Lijuan Wang
Zicheng Liu
Yuejian Fang
Nan Duan
VGen
10
72
0
20 Jul 2022
The Anatomy of Video Editing: A Dataset and Benchmark Suite for AI-Assisted Video Editing
Dawit Mureja Argaw
Fabian Caba Heilbron
Joon-Young Lee
Markus Woodson
In So Kweon
VGen
37
22
0
20 Jul 2022
ShapeCrafter: A Recursive Text-Conditioned 3D Shape Generation Model
Rao Fu
Xiaoyu Zhan
Yiwen Chen
Daniel E. Ritchie
Srinath Sridhar
24
79
0
19 Jul 2022
Don't Stop Learning: Towards Continual Learning for the CLIP Model
Yuxuan Ding
Lingqiao Liu
Chunna Tian
Jingyuan Yang
Haoxuan Ding
CLL
VLM
KELM
13
50
0
19 Jul 2022
Calibrated ensembles can mitigate accuracy tradeoffs under distribution shift
Ananya Kumar
Tengyu Ma
Percy Liang
Aditi Raghunathan
UQCV
OODD
OOD
24
38
0
18 Jul 2022