2112.09133
Masked Feature Prediction for Self-Supervised Visual Pre-Training
16 December 2021
Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
ViT
Papers citing "Masked Feature Prediction for Self-Supervised Visual Pre-Training"
50 / 462 papers shown
OmniMAE: Single Model Masked Pretraining on Images and Videos
Rohit Girdhar
Alaaeldin El-Nouby
Mannat Singh
Kalyan Vasudev Alwala
Armand Joulin
Ishan Misra
ViT
19
95
0
16 Jun 2022
iBoot: Image-bootstrapped Self-Supervised Video Representation Learning
F. Saleh
Fuwen Tan
Adrian Bulat
Georgios Tzimiropoulos
Brais Martínez
SSL
23
1
0
16 Jun 2022
Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency
Viraj Prabhu
Sriram Yenamandra
Aaditya K. Singh
Judy Hoffman
21
14
0
16 Jun 2022
Masked Frequency Modeling for Self-Supervised Visual Pre-Training
Jiahao Xie
Wei Li
Xiaohang Zhan
Ziwei Liu
Yew-Soon Ong
Chen Change Loy
11
69
0
15 Jun 2022
LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling
Linjie Li
Zhe Gan
Kevin Qinghong Lin
Chung-Ching Lin
Zicheng Liu
Ce Liu
Lijuan Wang
MLLM
VLM
18
81
0
14 Jun 2022
SERE: Exploring Feature Self-relation for Self-supervised Transformer
Zhong-Yu Li
Shanghua Gao
Ming-Ming Cheng
ViT
MDE
19
14
0
10 Jun 2022
Extreme Masking for Learning Instance and Distributed Visual Representations
Zhirong Wu
Zihang Lai
Xiao Sun
Stephen Lin
24
22
0
09 Jun 2022
Spatial Entropy as an Inductive Bias for Vision Transformers
E. Peruzzo
E. Sangineto
Yahui Liu
Marco De Nadai
Wei Bi
Bruno Lepri
N. Sebe
ViT
MDE
17
1
0
09 Jun 2022
Towards Understanding Why Mask-Reconstruction Pretraining Helps in Downstream Tasks
Jiachun Pan
Pan Zhou
Shuicheng Yan
SSL
6
14
0
08 Jun 2022
Siamese Image Modeling for Self-Supervised Vision Representation Learning
Chenxin Tao
Xizhou Zhu
Weijie Su
Gao Huang
Bin Li
Jie Zhou
Yu Qiao
Xiaogang Wang
Jifeng Dai
SSL
24
94
0
02 Jun 2022
A Survey on Video Action Recognition in Sports: Datasets, Methods and Applications
Fei Wu
Qingzhong Wang
Jiang Bian
Haoyi Xiong
Ning Ding
Feixiang Lu
Junqing Cheng
Dejing Dou
AI4TS
16
48
0
02 Jun 2022
Self-Supervised Visual Representation Learning with Semantic Grouping
Xin Wen
Bingchen Zhao
Anlin Zheng
X. Zhang
Xiaojuan Qi
SSL
101
71
0
30 May 2022
HiViT: Hierarchical Vision Transformer Meets Masked Image Modeling
Xiaosong Zhang
Yunjie Tian
Wei Huang
QiXiang Ye
Qi Dai
Lingxi Xie
Qi Tian
50
26
0
30 May 2022
SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang
Yangguang Li
Diana Marculescu
SSL
TPM
ViT
40
22
0
28 May 2022
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training
Renrui Zhang
Ziyu Guo
Rongyao Fang
Bingyan Zhao
Dong Wang
Yu Qiao
Hongsheng Li
Peng Gao
3DPC
169
241
0
28 May 2022
Object-wise Masked Autoencoders for Fast Pre-training
Jiantao Wu
Shentong Mo
ViT
OCL
17
15
0
28 May 2022
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Siyuan Li
Di Wu
Fang Wu
Lei Shang
Stan Z. Li
17
47
0
27 May 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
141
631
0
26 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
11
68
0
26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Hongsheng Li
17
53
0
26 May 2022
Pretraining is All You Need for Image-to-Image Translation
Tengfei Wang
Ting Zhang
Bo Zhang
Hao Ouyang
Dong Chen
Qifeng Chen
Fang Wen
DiffM
176
177
0
25 May 2022
A Study on Transformer Configuration and Training Objective
Fuzhao Xue
Jianghai Chen
Aixin Sun
Xiaozhe Ren
Zangwei Zheng
Xiaoxin He
Yongming Chen
Xin Jiang
Yang You
12
7
0
21 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
95
73
0
20 May 2022
Masked Image Modeling with Denoising Contrast
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
17
50
0
19 May 2022
Training Vision-Language Transformers from Captions
Liangke Gui
Yingshan Chang
Qiuyuan Huang
Subhojit Som
Alexander G. Hauptmann
Jianfeng Gao
Yonatan Bisk
VLM
ViT
172
11
0
19 May 2022
Global Contrast Masked Autoencoders Are Powerful Pathological Representation Learners
Hao Quan
Xingyu Li
Weixing Chen
Qun Bai
Mingchen Zou
Ruijie Yang
Tingting Zheng
R. Qi
Xin Gao
Xiaoyu Cui
MedIm
15
18
0
18 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao
Teli Ma
Hongsheng Li
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
16
119
0
08 May 2022
Automatic segmentation of meniscus based on MAE self-supervision and point-line weak supervision paradigm
Yuhan Xie
Kexin Jiang
Zhiyong Zhang
Shaolong Chen
Xiaodong Zhang
Changzhen Qiu
8
1
0
07 May 2022
CoCa: Contrastive Captioners are Image-Text Foundation Models
Jiahui Yu
Zirui Wang
Vijay Vasudevan
Legg Yeung
Mojtaba Seyedhosseini
Yonghui Wu
VLM
CLIP
OffRL
27
1,249
0
04 May 2022
The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training
Hao Liu
Xinghua Jiang
Xin Li
Antai Guo
Deqiang Jiang
Bo Ren
11
36
0
18 Apr 2022
Masked Siamese Networks for Label-Efficient Learning
Mahmoud Assran
Mathilde Caron
Ishan Misra
Piotr Bojanowski
Florian Bordes
Pascal Vincent
Armand Joulin
Michael G. Rabbat
Nicolas Ballas
SSL
11
311
0
14 Apr 2022
Evaluating Vision Transformer Methods for Deep Reinforcement Learning from Pixels
Tianxin Tao
Daniele Reda
M. van de Panne
ViT
8
19
0
11 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
6
54
0
06 Apr 2022
MultiMAE: Multi-modal Multi-task Masked Autoencoders
Roman Bachmann
David Mizrahi
Andrei Atanov
Amir Zamir
19
262
0
04 Apr 2022
Self-distillation Augmented Masked Autoencoders for Histopathological Image Classification
Yang Luo
Zhineng Chen
Shengtian Zhou
Xieping Gao
12
1
0
31 Mar 2022
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training
Xiaotong Li
Yixiao Ge
Kun Yi
Zixuan Hu
Ying Shan
Ling-yu Duan
11
38
0
29 Mar 2022
Mugs: A Multi-Granular Self-Supervised Learning Framework
Pan Zhou
Yichen Zhou
Chenyang Si
Weihao Yu
Teck Khim Ng
Shuicheng Yan
VLM
21
59
0
27 Mar 2022
Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers
Yunjie Tian
Lingxi Xie
Jiemin Fang
Mengnan Shi
Junran Peng
Xiaopeng Zhang
Jianbin Jiao
Qi Tian
QiXiang Ye
10
19
0
27 Mar 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
25
1,114
0
23 Mar 2022
Masked Discrimination for Self-Supervised Learning on Point Clouds
Haotian Liu
Mu Cai
Yong Jae Lee
3DPC
19
161
0
21 Mar 2022
Three things everyone should know about Vision Transformers
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Jakob Verbeek
Hervé Jégou
ViT
8
118
0
18 Mar 2022
Masked Autoencoders for Point Cloud Self-supervised Learning
Yatian Pang
Wenxiao Wang
Francis E. H. Tay
W. Liu
Yonghong Tian
Li Yuan
3DPC
ViT
23
448
0
13 Mar 2022
MVP: Multimodality-guided Visual Pre-training
Longhui Wei
Lingxi Xie
Wen-gang Zhou
Houqiang Li
Qi Tian
11
104
0
10 Mar 2022
Uni4Eye: Unified 2D and 3D Self-supervised Pre-training via Masked Image Modeling Transformer for Ophthalmic Image Classification
Zhiyuan Cai
Li Lin
Huaqing He
Xiaoying Tang
ViT
MedIm
11
28
0
09 Mar 2022
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
Qiming Zhang
Yufei Xu
Jing Zhang
Dacheng Tao
ViT
6
225
0
21 Feb 2022
CommerceMM: Large-Scale Commerce MultiModal Representation Learning with Omni Retrieval
Licheng Yu
Jun Chen
Animesh Sinha
Mengjiao MJ Wang
Hugo Chen
Tamara L. Berg
Ning Zhang
VLM
17
39
0
15 Feb 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL
VLM
ViT
24
823
0
07 Feb 2022
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
Yuxin Fang
Li Dong
Hangbo Bao
Xinggang Wang
Furu Wei
9
86
0
07 Feb 2022
Context Autoencoder for Self-Supervised Representation Learning
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
19
384
0
07 Feb 2022
Mask-based Latent Reconstruction for Reinforcement Learning
Tao Yu
Zhizheng Zhang
Cuiling Lan
Yan Lu
Zhibo Chen
11
44
0
28 Jan 2022