2103.00020
Learning Transferable Visual Models From Natural Language Supervision
26 February 2021
Alec Radford
Jong Wook Kim
Chris Hallacy
Aditya A. Ramesh
Gabriel Goh
Sandhini Agarwal
Girish Sastry
Amanda Askell
Pamela Mishkin
Jack Clark
Gretchen Krueger
Ilya Sutskever
CLIP
VLM
Papers citing "Learning Transferable Visual Models From Natural Language Supervision"
50 / 8,849 papers shown
Dataset Distillation: A Comprehensive Review
Ruonan Yu
Songhua Liu
Xinchao Wang
DD
35
121
0
17 Jan 2023
RILS: Masked Visual Reconstruction in Language Semantic Space
Shusheng Yang
Yixiao Ge
Kun Yi
Dian Li
Ying Shan
Xiaohu Qie
Xinggang Wang
CLIP
27
11
0
17 Jan 2023
USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval
Yan Zhang
Zhong Ji
Dingrong Wang
Yanwei Pang
Xuelong Li
VLM
16
21
0
17 Jan 2023
A Large-Scale Outdoor Multi-modal Dataset and Benchmark for Novel View Synthesis and Implicit Scene Reconstruction
Chongshan Lu
Fukun Yin
Xin Chen
Tao Chen
Gang Yu
Jiayuan Fan
25
31
0
17 Jan 2023
UATVR: Uncertainty-Adaptive Text-Video Retrieval
Bo Fang
Wenhao Wu
Chang-rui Liu
Yu Zhou
Yuxin Song
Weiping Wang
Min Yang
Xiang Ji
Jingdong Wang
19
45
0
16 Jan 2023
Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning with Multimodal Models
Zhiqiu Lin
Samuel Yu
Zhiyi Kuang
Deepak Pathak
Deva Ramanan
VLM
15
97
0
16 Jan 2023
AutoFraudNet: A Multimodal Network to Detect Fraud in the Auto Insurance Industry
Azin Asgarian
Rohit Saha
Daniel Jakubovitz
Julia Peyre
21
2
0
15 Jan 2023
Diatom-inspired architected materials using language-based deep learning: Perception, transformation and manufacturing
Markus J. Buehler
AI4CE
16
5
0
14 Jan 2023
GH-Feat: Learning Versatile Generative Hierarchical Features from GANs
Yinghao Xu
Yujun Shen
Jiapeng Zhu
Ceyuan Yang
Bolei Zhou
12
2
0
12 Jan 2023
See, Think, Confirm: Interactive Prompting Between Vision and Language Models for Knowledge-based Visual Reasoning
Zhenfang Chen
Qinhong Zhou
Yikang Shen
Yining Hong
Hao Zhang
Chuang Gan
LRM
VLM
29
35
0
12 Jan 2023
Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Pruning
Huan Wang
Can Qin
Yue Bai
Yun Fu
32
20
0
12 Jan 2023
Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study
Mariya Hendriksen
Svitlana Vakulenko
E. Kuiper
Maarten de Rijke
21
5
0
12 Jan 2023
Poses of People in Art: A Data Set for Human Pose Estimation in Digital Art History
Stefanie Schneider
Ricarda Vollmer
3DH
19
5
0
12 Jan 2023
Artificial Intelligence Generated Coins for Size Comparison
Gerald Artner
17
0
0
11 Jan 2023
EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata
Chenhao Zheng
Ayush Shrivastava
Andrew Owens
VLM
22
11
0
11 Jan 2023
Pix2Map: Cross-modal Retrieval for Inferring Street Maps from Images
Xindi Wu
Kwun-fung Lau
Francesco Ferroni
Aljosa Osep
Deva Ramanan
26
7
0
10 Jan 2023
Transferring Pre-trained Multimodal Representations with Cross-modal Similarity Matching
Byoungjip Kim
Sun Choi
Dasol Hwang
Moontae Lee
Honglak Lee
14
10
0
07 Jan 2023
TarViS: A Unified Approach for Target-based Video Segmentation
A. Athar
Alexander Hermans
Jonathon Luiten
Deva Ramanan
Bastian Leibe
VOS
21
29
0
06 Jan 2023
Does compressing activations help model parallel training?
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
13
4
0
06 Jan 2023
HierVL: Learning Hierarchical Video-Language Embeddings
Kumar Ashutosh
Rohit Girdhar
Lorenzo Torresani
Kristen Grauman
VLM
AI4TS
20
51
0
05 Jan 2023
All in Tokens: Unifying Output Space of Visual Tasks via Soft Token
Jia Ning
Chen Li
Zheng-Wei Zhang
Zigang Geng
Qi Dai
Kun He
Han Hu
33
43
0
05 Jan 2023
Learning Trajectory-Word Alignments for Video-Language Tasks
Xu Yang
Zhang Li
Haiyang Xu
Hanwang Zhang
Qinghao Ye
Chenliang Li
Ming Yan
Yu Zhang
Fei Huang
Songfang Huang
23
7
0
05 Jan 2023
TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models
Sucheng Ren
Fangyun Wei
Zheng-Wei Zhang
Han Hu
22
34
0
03 Jan 2023
Understanding Imbalanced Semantic Segmentation Through Neural Collapse
Zhisheng Zhong
Jiequan Cui
Yibo Yang
Xiaoyang Wu
Xiaojuan Qi
X. Zhang
Jiaya Jia
124
45
0
03 Jan 2023
PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation
Xiangtai Li
Shilin Xu
Yibo Yang
Haobo Yuan
Guangliang Cheng
Yu Tong
Zhouchen Lin
Ming-Hsuan Yang
Dacheng Tao
ViT
29
21
0
03 Jan 2023
Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation
Jianzong Wu
Xiangtai Li
Henghui Ding
Xia Li
Guangliang Cheng
Yu Tong
Chen Change Loy
VLM
80
31
0
02 Jan 2023
NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory
Santhosh Kumar Ramakrishnan
Ziad Al-Halah
Kristen Grauman
80
39
0
02 Jan 2023
DiRaC-I: Identifying Diverse and Rare Training Classes for Zero-Shot Learning
Sandipan Sarma
Arijit Sur
VLM
11
1
0
31 Dec 2022
Unlearnable Clusters: Towards Label-agnostic Unlearnable Examples
Jiaming Zhang
Xingjun Ma
Qiaomin Yi
Jitao Sang
Yugang Jiang
Yaowei Wang
Changsheng Xu
11
24
0
31 Dec 2022
Stroke-based Rendering: From Heuristics to Deep Learning
Florian Nolte
Andrew Melnik
Helge J. Ritter
GAN
33
5
0
30 Dec 2022
MVTN: Learning Multi-View Transformations for 3D Understanding
Abdullah Hamdi
Faisal AlZahrani
Silvio Giancola
Bernard Ghanem
3DPC
3DV
8
6
0
27 Dec 2022
DiffFace: Diffusion-based Face Swapping with Facial Guidance
Kihong Kim
Yunho Kim
Seokju Cho
Junyoung Seo
Jisu Nam
Kychul Lee
Seung Wook Kim
Kwanghee Lee
DiffM
16
51
0
27 Dec 2022
PaletteNeRF: Palette-based Color Editing for NeRFs
Qiling Wu
Jianchao Tan
Kun Xu
22
18
0
25 Dec 2022
Principled and Efficient Transfer Learning of Deep Models via Neural Collapse
Xiao Li
Sheng Liu
Jin-li Zhou
Xin Lu
C. Fernandez‐Granda
Zhihui Zhu
Q. Qu
AAML
19
18
0
23 Dec 2022
Robust Meta-Representation Learning via Global Label Inference and Classification
Ruohan Wang
Isak Falk
Massimiliano Pontil
C. Ciliberto
19
3
0
22 Dec 2022
Reversible Column Networks
Yuxuan Cai
Yi Zhou
Qi Han
Jianjian Sun
Xiangwen Kong
Jun Yu Li
Xiangyu Zhang
VLM
29
53
0
22 Dec 2022
Unleashing the Power of Visual Prompting At the Pixel Level
Junyang Wu
Xianhang Li
Chen Wei
Huiyu Wang
Alan Yuille
Yuyin Zhou
Cihang Xie
VPVLM
VLM
19
31
0
20 Dec 2022
A Length-Extrapolatable Transformer
Yutao Sun
Li Dong
Barun Patra
Shuming Ma
Shaohan Huang
Alon Benhaim
Vishrav Chaudhary
Xia Song
Furu Wei
24
115
0
20 Dec 2022
Can Current Task-oriented Dialogue Models Automate Real-world Scenarios in the Wild?
Sang-Woo Lee
Sungdong Kim
Donghyeon Ko
Dong-hyun Ham
Youngki Hong
...
Wangkyo Jung
Kyunghyun Cho
Donghyun Kwak
H. Noh
W. Park
36
1
0
20 Dec 2022
Position-guided Text Prompt for Vision-Language Pre-training
Alex Jinpeng Wang
Pan Zhou
Mike Zheng Shou
Shuicheng Yan
VLM
19
37
0
19 Dec 2022
Universal Object Detection with Large Vision Model
Feng-Huei Lin
Wenze Hu
Yaowei Wang
Yonghong Tian
Guangming Lu
Fanglin Chen
Yong-mei Xu
Xiaoyu Wang
VLM
ObjD
27
8
0
19 Dec 2022
AI Art in Architecture
J. Ploennigs
Markus Berger
DiffM
32
63
0
19 Dec 2022
SrTR: Self-reasoning Transformer with Visual-linguistic Knowledge for Scene Graph Generation
Yuxiang Zhang
Zhenbo Liu
Shuai Wang
ReLM
LRM
19
1
0
19 Dec 2022
Diffusing Surrogate Dreams of Video Scenes to Predict Video Memorability
Lorin Sweeney
Graham Healy
A. Smeaton
DiffM
13
2
0
19 Dec 2022
Transferring General Multimodal Pretrained Models to Text Recognition
Junyang Lin
Xuancheng Ren
Yichang Zhang
Gao Liu
Peng Wang
An Yang
Chang Zhou
32
4
0
19 Dec 2022
Face Generation and Editing with StyleGAN: A Survey
Andrew Melnik
Maksim Miasayedzenkau
Dzianis Makaravets
Dzianis Pirshtuk
Eren Akbulut
Dennis Holzmann
Tarek Renusch
Gustav Reichert
Helge J. Ritter
CVBM
19
39
0
18 Dec 2022
3D Point Cloud Pre-training with Knowledge Distillation from 2D Images
Yuan Yao
Yuanhan Zhang
Zhen-fei Yin
Jiebo Luo
Wanli Ouyang
Xiaoshui Huang
3DPC
14
10
0
17 Dec 2022
Foundation models in brief: A historical, socio-technical focus
Johannes Schneider
VLM
6
9
0
17 Dec 2022
Hyperbolic Hierarchical Contrastive Hashing
Rukai Wei
Yu Liu
Jingkuan Song
Yanzhao Xie
Ke Zhou
24
7
0
17 Dec 2022
Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
Qiucheng Wu
Yujian Liu
Handong Zhao
Ajinkya Kale
T. Bui
Tong Yu
Zhe-nan Lin
Yang Zhang
Shiyu Chang
DiffM
CoGe
14
96
0
16 Dec 2022