2104.02057
Cited By
An Empirical Study of Training Self-Supervised Vision Transformers
5 April 2021
Xinlei Chen
Saining Xie
Kaiming He
ViT
Papers citing
"An Empirical Study of Training Self-Supervised Vision Transformers"
50 / 389 papers shown
Title
VeLO: Training Versatile Learned Optimizers by Scaling Up
Luke Metz
James Harrison
C. Freeman
Amil Merchant
Lucas Beyer
...
Naman Agrawal
Ben Poole
Igor Mordatch
Adam Roberts
Jascha Narain Sohl-Dickstein
24
60
0
17 Nov 2022
CPT-V: A Contrastive Approach to Post-Training Quantization of Vision Transformers
N. Frumkin
Dibakar Gope
Diana Marculescu
ViT
MQ
21
1
0
17 Nov 2022
Prompt Tuning for Parameter-efficient Medical Image Segmentation
Marc Fischer
Alexander Bartler
Bin Yang
SSeg
14
18
0
16 Nov 2022
MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis
Tianhong Li
Huiwen Chang
Shlok Kumar Mishra
Han Zhang
Dina Katabi
Dilip Krishnan
32
152
0
16 Nov 2022
Masked Reconstruction Contrastive Learning with Information Bottleneck Principle
Ziwen Liu
Bonan Li
Congying Han
Tiande Guo
Xuecheng Nie
SSL
32
2
0
15 Nov 2022
EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
Yuxin Fang
Wen Wang
Binhui Xie
Quan-Sen Sun
Ledell Yu Wu
Xinggang Wang
Tiejun Huang
Xinlong Wang
Yue Cao
VLM
CLIP
56
673
0
14 Nov 2022
Contrastive Self-Supervised Learning for Skeleton Representations
N. Lingg
Miguel Sarabia
Luca Zappella
B. Theobald
SSL
19
0
0
10 Nov 2022
Distilling Representations from GAN Generator via Squeeze and Span
Yu Yang
Xiaotian Cheng
Chang-rui Liu
Hakan Bilen
Xiang Ji
GAN
29
0
0
06 Nov 2022
Pixel-Wise Contrastive Distillation
Junqiang Huang
Zichao Guo
37
4
0
01 Nov 2022
Max Pooling with Vision Transformers reconciles class and shape in weakly supervised semantic segmentation
Simone Rossetti
Damiano Zappia
Marta Sanzari
M. Schaerf
F. Pirri
ViT
36
57
0
31 Oct 2022
A simple, efficient and scalable contrastive masked autoencoder for learning visual representations
Shlok Kumar Mishra
Joshua Robinson
Huiwen Chang
David Jacobs
Aaron Sarna
Aaron Maschinot
Dilip Krishnan
DiffM
43
30
0
30 Oct 2022
Open-vocabulary Semantic Segmentation with Frozen Vision-Language Models
Chaofan Ma
Yu-Hao Yang
Yanfeng Wang
Ya-Qin Zhang
Weidi Xie
VLM
21
48
0
27 Oct 2022
Exploiting Features and Logits in Heterogeneous Federated Learning
Yun-Hin Chan
Edith C. H. Ngai
FedML
24
2
0
27 Oct 2022
Learning Explicit Object-Centric Representations with Vision Transformers
Oscar Vikström
Alexander Ilin
OCL
ViT
30
4
0
25 Oct 2022
Deep Model Reassembly
Xingyi Yang
Zhou Daquan
Songhua Liu
Jingwen Ye
Xinchao Wang
MoMe
20
120
0
24 Oct 2022
Foreground Guidance and Multi-Layer Feature Fusion for Unsupervised Object Discovery with Transformers
Zhiwei Lin
Ze Yang
Yongtao Wang
ViT
26
2
0
24 Oct 2022
Adversarial Pretraining of Self-Supervised Deep Networks: Past, Present and Future
Guo-Jun Qi
M. Shah
SSL
23
8
0
23 Oct 2022
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
24
31
0
21 Oct 2022
Self-Supervised Learning via Maximum Entropy Coding
Xin Liu
Zhongdao Wang
Yali Li
Shengjin Wang
SSL
12
39
0
20 Oct 2022
Towards Sustainable Self-supervised Learning
Shanghua Gao
Pan Zhou
Ming-Ming Cheng
Shuicheng Yan
CLL
37
7
0
20 Oct 2022
SSiT: Saliency-guided Self-supervised Image Transformer for Diabetic Retinopathy Grading
Yijin Huang
Junyan Lyu
Pujin Cheng
Roger Tam
Xiaoying Tang
ViT
MedIm
19
19
0
20 Oct 2022
A Unified View of Masked Image Modeling
Zhiliang Peng
Li Dong
Hangbo Bao
QiXiang Ye
Furu Wei
VLM
52
35
0
19 Oct 2022
Learning Self-Regularized Adversarial Views for Self-Supervised Vision Transformers
Tao Tang
Changlin Li
Guangrun Wang
Kaicheng Yu
Xiaojun Chang
Xiaodan Liang
ViT
18
1
0
16 Oct 2022
When Adversarial Training Meets Vision Transformers: Recipes from Training to Architecture
Yi Mo
Dongxian Wu
Yifei Wang
Yiwen Guo
Yisen Wang
ViT
32
52
0
14 Oct 2022
Holo-Dex: Teaching Dexterity with Immersive Mixed Reality
Sridhar Pandian Arunachalam
Irmak Güzey
Soumith Chintala
Lerrel Pinto
32
67
0
12 Oct 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
Zhiying Lu
Hongtao Xie
Chuanbin Liu
Yongdong Zhang
ViT
10
57
0
12 Oct 2022
HiCo: Hierarchical Contrastive Learning for Ultrasound Video Model Pretraining
Chunhui Zhang
Yixiong Chen
Li Liu
Qiong Liu
Xiaoping Zhou
VLM
40
8
0
10 Oct 2022
Env-Aware Anomaly Detection: Ignore Style Changes, Stay True to Content!
Stefan Smeu
Elena Burceanu
Andrei Liviu Nicolicioiu
Emanuela Haller
27
4
0
06 Oct 2022
RankMe: Assessing the downstream performance of pretrained self-supervised representations by their rank
Q. Garrido
Randall Balestriero
Laurent Najman
Yann LeCun
SSL
46
72
0
05 Oct 2022
Learning Hierarchical Image Segmentation For Recognition and By Recognition
Tsung-Wei Ke
Sangwoo Mo
Stella X. Yu
VLM
27
9
0
01 Oct 2022
Slimmable Networks for Contrastive Self-supervised Learning
Shuai Zhao
Xiaohan Wang
Linchao Zhu
Yi Yang
22
1
0
30 Sep 2022
Bridging the Gap to Real-World Object-Centric Learning
Maximilian Seitzer
Max Horn
Andrii Zadaianchuk
Dominik Zietlow
Tianjun Xiao
...
Tong He
Zheng-Wei Zhang
Bernhard Schölkopf
Thomas Brox
Francesco Locatello
OCL
37
139
0
29 Sep 2022
Audio Barlow Twins: Self-Supervised Audio Representation Learning
Jonah Anton
H. Coppock
Pancham Shukla
Bjorn W. Schuller
BDL
SSL
40
8
0
28 Sep 2022
Multimodal Channel-Mixing: Channel and Spatial Masked AutoEncoder on Facial Action Unit Detection
Xiang Zhang
Huiyuan Yang
Taoyue Wang
Xiaotian Li
L. Yin
19
7
0
25 Sep 2022
Pretraining the Vision Transformer using self-supervised methods for vision based Deep Reinforcement Learning
Manuel Goulão
Arlindo L. Oliveira
ViT
33
6
0
22 Sep 2022
A Simple and Powerful Global Optimization for Unsupervised Video Object Segmentation
Georgy Ponimatkin
Nermin Samet
Yanghua Xiao
Yuming Du
Renaud Marlet
Vincent Lepetit
VOS
72
20
0
19 Sep 2022
Exploring Target Representations for Masked Autoencoders
Xingbin Liu
Jinghao Zhou
Tao Kong
Xianming Lin
Rongrong Ji
79
50
0
08 Sep 2022
Design of the topology for contrastive visual-textual alignment
Zhun Sun
25
1
0
05 Sep 2022
TokenCut: Segmenting Objects in Images and Videos with Self-supervised Transformer and Normalized Cut
Yangtao Wang
Xiaoke Shen
Yuan Yuan
Yuming Du
Maomao Li
S. Hu
James L. Crowley
Dominique Vaufreydaz
VOS
ViT
15
76
0
01 Sep 2022
CMD: Self-supervised 3D Action Representation Learning with Cross-modal Mutual Distillation
Yunyao Mao
Wen-gang Zhou
Zhenbo Lu
Jiajun Deng
Houqiang Li
28
38
0
26 Aug 2022
MaskCLIP: Masked Self-Distillation Advances Contrastive Language-Image Pretraining
Xiaoyi Dong
Jianmin Bao
Yinglin Zheng
Ting Zhang
Dongdong Chen
...
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
VLM
35
157
0
25 Aug 2022
Refine and Represent: Region-to-Object Representation Learning
Akash Gokul
Konstantinos Kallidromitis
Shufang Li
Yu Kato
Kazuki Kozuka
Trevor Darrell
Colorado Reed
SSeg
27
5
0
25 Aug 2022
Federated Self-Supervised Contrastive Learning and Masked Autoencoder for Dermatological Disease Diagnosis
Yawen Wu
Dewen Zeng
Zhepeng Wang
Yi Sheng
Lei Yang
A. James
Yiyu Shi
Jingtong Hu
18
7
0
24 Aug 2022
RenyiCL: Contrastive Representation Learning with Skew Renyi Divergence
Kyungmin Lee
Jinwoo Shin
SSL
DRL
27
10
0
12 Aug 2022
On the Pros and Cons of Momentum Encoder in Self-Supervised Visual Representation Learning
T. Pham
Chaoning Zhang
Axi Niu
Kang Zhang
Chang-Dong Yoo
36
11
0
11 Aug 2022
How Well Do Vision Transformers (VTs) Transfer To The Non-Natural Image Domain? An Empirical Study Involving Art Classification
Vincent Tonkes
M. Sabatelli
ViT
25
6
0
09 Aug 2022
MVSFormer: Multi-View Stereo by Learning Robust Image Features and Temperature-based Depth
Chenjie Cao
Xinlin Ren
Yanwei Fu
22
44
0
04 Aug 2022
Multi-Feature Vision Transformer via Self-Supervised Representation Learning for Improvement of COVID-19 Diagnosis
Xiao Qi
D. Foran
J. Nosher
I. Hacihaliloglu
ViT
MedIm
22
3
0
03 Aug 2022
A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond
Chaoning Zhang
Chenshuang Zhang
Junha Song
John Seon Keun Yi
Kang Zhang
In So Kweon
SSL
52
71
0
30 Jul 2022
Contrastive Masked Autoencoders are Stronger Vision Learners
Zhicheng Huang
Xiaojie Jin
Cheng Lu
Qibin Hou
Ming-Ming Cheng
Dongmei Fu
Xiaohui Shen
Jiashi Feng
31
147
0
27 Jul 2022