2112.09133
Masked Feature Prediction for Self-Supervised Visual Pre-Training
16 December 2021
Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
ViT
Papers citing
"Masked Feature Prediction for Self-Supervised Visual Pre-Training"
41 / 491 papers shown
Title
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
International Conference on Machine Learning (ICML), 2022
Siyuan Li
Di Wu
Fang Wu
Lei Shang
Stan.Z.Li
167
58
0
27 May 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Neural Information Processing Systems (NeurIPS), 2022
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
470
909
0
26 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Neural Information Processing Systems (NeurIPS), 2022
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
250
82
0
26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Hongsheng Li
188
76
0
26 May 2022
Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang
Ting Zhang
Bo Zhang
Hao Ouyang
Dong Chen
Qifeng Chen
Fang Wen
DiffM
376
198
0
25 May 2022
A Study on Transformer Configuration and Training Objective
International Conference on Machine Learning (ICML), 2022
Fuzhao Xue
Jianghai Chen
Aixin Sun
Xiaozhe Ren
Zangwei Zheng
Xiaoxin He
Yongming Chen
Xin Jiang
Yang You
162
10
0
21 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
284
85
0
20 May 2022
Masked Image Modeling with Denoising Contrast
International Conference on Learning Representations (ICLR), 2022
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
163
65
0
19 May 2022
Training Vision-Language Transformers from Captions
Liangke Gui
Yingshan Chang
Qiuyuan Huang
Subhojit Som
Alexander G. Hauptmann
Jianfeng Gao
Yonatan Bisk
VLM
ViT
342
11
0
19 May 2022
Global Contrast Masked Autoencoders Are Powerful Pathological Representation Learners
Pattern Recognition (Pattern Recogn.), 2022
Hao Quan
Xingyu Li
Weixing Chen
Qun Bai
Mingchen Zou
Ruijie Yang
Tingting Zheng
R. Qi
Xin Gao
Xiaoyu Cui
MedIm
292
29
0
18 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao
Teli Ma
Hongsheng Li
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
216
150
0
08 May 2022
Automatic segmentation of meniscus based on MAE self-supervision and point-line weak supervision paradigm
Yuhan Xie
Kexin Jiang
Zhiyong Zhang
Shaolong Chen
Xiaodong Zhang
Changzhen Qiu
125
2
0
07 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
531
1,575
0
04 May 2022
The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training
AAAI Conference on Artificial Intelligence (AAAI), 2022
Hao Liu
Xinghua Jiang
Xin Li
Antai Guo
Deqiang Jiang
Bo Ren
172
43
0
18 Apr 2022
Masked Siamese Networks for Label-Efficient Learning
European Conference on Computer Vision (ECCV), 2022
Mahmoud Assran
Mathilde Caron
Ishan Misra
Piotr Bojanowski
Florian Bordes
Pascal Vincent
Armand Joulin
Michael G. Rabbat
Nicolas Ballas
SSL
274
373
0
14 Apr 2022
Evaluating Vision Transformer Methods for Deep Reinforcement Learning from Pixels
Tianxin Tao
Daniele Reda
M. van de Panne
ViT
196
20
0
11 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
IEEE International Conference on Computer Vision (ICCV), 2022
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
183
66
0
06 Apr 2022
MultiMAE: Multi-modal Multi-task Masked Autoencoders
European Conference on Computer Vision (ECCV), 2022
Roman Bachmann
David Mizrahi
Andrei Atanov
Amir Zamir
339
339
0
04 Apr 2022
Self-distillation Augmented Masked Autoencoders for Histopathological Image Classification
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2022
Yang Luo
Zhineng Chen
Shengtian Zhou
Xieping Gao
190
3
0
31 Mar 2022
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training
European Conference on Computer Vision (ECCV), 2022
Xiaotong Li
Yixiao Ge
Kun Yi
Zixuan Hu
Ying Shan
Ling-yu Duan
308
44
0
29 Mar 2022
Mugs: A Multi-Granular Self-Supervised Learning Framework
Pan Zhou
Yichen Zhou
Chenyang Si
Weihao Yu
Teck Khim Ng
Shuicheng Yan
VLM
140
67
0
27 Mar 2022
Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers
Yunjie Tian
Lingxi Xie
Jiemin Fang
Mengnan Shi
Junran Peng
Xiaopeng Zhang
Jianbin Jiao
Qi Tian
QiXiang Ye
143
21
0
27 Mar 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Neural Information Processing Systems (NeurIPS), 2022
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
616
1,578
0
23 Mar 2022
Masked Discrimination for Self-Supervised Learning on Point Clouds
European Conference on Computer Vision (ECCV), 2022
Haotian Liu
Mu Cai
Yong Jae Lee
3DPC
318
216
0
21 Mar 2022
Three things everyone should know about Vision Transformers
European Conference on Computer Vision (ECCV), 2022
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Jakob Verbeek
Edouard Grave
ViT
215
149
0
18 Mar 2022
Masked Autoencoders for Point Cloud Self-supervised Learning
European Conference on Computer Vision (ECCV), 2022
Yatian Pang
Wenxiao Wang
Francis E. H. Tay
Wen Liu
Yonghong Tian
Liuliang Yuan
3DPC
ViT
245
610
0
13 Mar 2022
MVP: Multimodality-guided Visual Pre-training
European Conference on Computer Vision (ECCV), 2022
Longhui Wei
Lingxi Xie
Wen-gang Zhou
Houqiang Li
Qi Tian
183
128
0
10 Mar 2022
Uni4Eye: Unified 2D and 3D Self-supervised Pre-training via Masked Image Modeling Transformer for Ophthalmic Image Classification
International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2022
Zhiyuan Cai
Li Lin
Huaqing He
Xiaoying Tang
ViT
MedIm
170
31
0
09 Mar 2022
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
International Journal of Computer Vision (IJCV), 2022
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
ViT
248
271
0
21 Feb 2022
CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval
Knowledge Discovery and Data Mining (KDD), 2022
Licheng Yu
Jun Chen
Animesh Sinha
Mengjiao MJ Wang
Hugo Chen
Tamara L. Berg
Ning Zhang
VLM
226
44
0
15 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
International Conference on Machine Learning (ICML), 2022
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL
VLM
ViT
412
1,014
0
07 Feb 2022
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
International Conference on Learning Representations (ICLR), 2022
Yuxin Fang
Li Dong
Hangbo Bao
Xinggang Wang
Furu Wei
267
92
0
07 Feb 2022
Context Autoencoder for Self-Supervised Representation Learning
International Journal of Computer Vision (IJCV), 2022
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
390
443
0
07 Feb 2022
Mask-based Latent Reconstruction for Reinforcement Learning
Neural Information Processing Systems (NeurIPS), 2022
Tao Yu
Zhizheng Zhang
Cuiling Lan
Yan Lu
Zhibo Chen
219
59
0
28 Jan 2022
Video Transformers: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
Javier Selva
A. S. Johansen
Sergio Escalera
Kamal Nasrollahi
T. Moeslund
Albert Clapés
ViT
374
132
0
16 Jan 2022
DualFormer: Local-Global Stratified Transformer for Efficient Video Recognition
Yuxuan Liang
Pan Zhou
Roger Zimmermann
Shuicheng Yan
ViT
168
24
0
09 Dec 2021
Unsupervised Domain Generalization by Learning a Bridge Across Domains
Computer Vision and Pattern Recognition (CVPR), 2021
Sivan Harary
Eli Schwartz
Assaf Arbelle
Peter W. J. Staar
Shady Abu Hussein
...
Hildegard Kuehne
Dina Katabi
Kate Saenko
Rogerio Feris
Leonid Karlinsky
OOD
234
54
0
04 Dec 2021
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
308
271
0
24 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Hai-Tao Zheng
Li Tao
Dun Liang
Haitao Zheng
542
112
0
07 Nov 2021
Exploring the Diversity and Invariance in Yourself for Visual Pre-Training Task
Pattern Recognition (Pattern Recogn.), 2021
Longhui Wei
Lingxi Xie
Wen-gang Zhou
Houqiang Li
Qi Tian
SSL
210
4
0
01 Jun 2021
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments
Mathilde Caron
Ishan Misra
Julien Mairal
Priya Goyal
Piotr Bojanowski
Armand Joulin
OCL
SSL
1.1K
4,618
0
17 Jun 2020