Masked Feature Prediction for Self-Supervised Visual Pre-Training

16 December 2021
Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
    ViT

Papers citing "Masked Feature Prediction for Self-Supervised Visual Pre-Training"

50 / 498 papers shown
Affordance Grounding from Demonstration Video to Target Image
Computer Vision and Pattern Recognition (CVPR), 2023
Joya Chen
Difei Gao
Kevin Qinghong Lin
Mike Zheng Shou
181
44
0
26 Mar 2023
3Mformer: Multi-order Multi-mode Transformer for Skeletal Action Recognition
Computer Vision and Pattern Recognition (CVPR), 2023
Lei Wang
Piotr Koniusz
ViT
224
68
0
25 Mar 2023
Active Finetuning: Exploiting Annotation Budget in the Pretraining-Finetuning Paradigm
Computer Vision and Pattern Recognition (CVPR), 2023
Yichen Xie
Han Lu
Junchi Yan
Yunbo Wang
Masayoshi Tomizuka
Wei Zhan
248
44
0
25 Mar 2023
Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2023
Xiaoyang Wu
Xin Wen
Xihui Liu
Hengshuang Zhao
3DPC
272
57
0
24 Mar 2023
Temperature Schedules for Self-Supervised Contrastive Methods on Long-Tail Data
International Conference on Learning Representations (ICLR), 2023
Anna Kukleva
Moritz Bohle
Bernt Schiele
Hilde Kuehne
Christian Rupprecht
256
60
0
23 Mar 2023
The effectiveness of MAE pre-pretraining for billion-scale pretraining
IEEE International Conference on Computer Vision (ICCV), 2023
Mannat Singh
Quentin Duval
Kalyan Vasudev Alwala
Haoqi Fan
Vaibhav Aggarwal
...
Piotr Dollár
Christoph Feichtenhofer
Ross B. Girshick
Rohit Girdhar
Ishan Misra
LRM
377
86
0
23 Mar 2023
FeatureNeRF: Learning Generalizable NeRFs by Distilling Foundation Models
IEEE International Conference on Computer Vision (ICCV), 2023
Jianglong Ye
Naiyan Wang
Xinyu Wang
DiffM
247
53
0
22 Mar 2023
Correlational Image Modeling for Self-Supervised Visual Pre-Training
Computer Vision and Pattern Recognition (CVPR), 2023
Wei Li
Jiahao Xie
Chen Change Loy
SSL
315
18
0
22 Mar 2023
ViC-MAE: Self-Supervised Representation Learning from Images and Video with Contrastive Masked Autoencoders
J. Hernandez
Ruben Villegas
Vicente Ordonez
SSL
167
2
0
21 Mar 2023
FedMAE: Federated Self-Supervised Learning with One-Block Masked Auto-Encoder
Nan Yang
Xuanyu Chen
Charles Z. Liu
Dong Yuan
Wei Bao
Li-zhen Cui
189
5
0
20 Mar 2023
AdPE: Adversarial Positional Embeddings for Pretraining Vision Transformers via MAE+
Tianlin Li
Ying Wang
Ziwei Xuan
Guo-Jun Qi
ViT
177
4
0
14 Mar 2023
DPPMask: Masked Image Modeling with Determinantal Point Processes
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Junde Xu
Zikai Lin
Donghao Zhou
Yao-Cheng Yang
Xiangyun Liao
Bian Wu
Guangyong Chen
Pheng-Ann Heng
305
3
0
13 Mar 2023
Improving Masked Autoencoders by Learning Where to Mask
Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2023
Haijia Chen
Wendong Zhang
Yunbo Wang
Xiaokang Yang
SSL
172
24
0
12 Mar 2023
Mimic before Reconstruct: Enhancing Masked Autoencoders with Feature Mimicking
International Journal of Computer Vision (IJCV), 2023
Shiyang Feng
Renrui Zhang
Rongyao Fang
Ziyi Lin
Hongyang Li
Jiaming Song
Qiao Yu
169
25
0
09 Mar 2023
Masked Image Modeling with Local Multi-Scale Reconstruction
Computer Vision and Pattern Recognition (CVPR), 2023
Haoqing Wang
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhiwei Deng
Kai Han
199
68
0
09 Mar 2023
Centroid-centered Modeling for Efficient Vision Transformer Pre-training
Chinese Conference on Pattern Recognition and Computer Vision (CPRCV), 2023
Xin Yan
Zuchao Li
Lefei Zhang
Bo Du
Dacheng Tao
VLM
144
1
0
08 Mar 2023
Masked Images Are Counterfactual Samples for Robust Fine-tuning
Computer Vision and Pattern Recognition (CVPR), 2023
Yao Xiao
Ziyi Tang
Pengxu Wei
Cong Liu
Guanbin Li
358
23
0
06 Mar 2023
PixMIM: Rethinking Pixel Reconstruction in Masked Image Modeling
Yuan Liu
Songyang Zhang
Jiacheng Chen
Kai-xiang Chen
Dahua Lin
249
38
0
04 Mar 2023
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System
Chao Xue
Wen Liu
Shunxing Xie
Zhenfang Wang
Jiaxing Li
...
Shi-Yong Chen
Yibing Zhan
Jing Zhang
Chaoyue Wang
Dacheng Tao
232
3
0
01 Mar 2023
Generic-to-Specific Distillation of Masked Autoencoders
Computer Vision and Pattern Recognition (CVPR), 2023
Wei Huang
Zhiliang Peng
Li Dong
Furu Wei
Jianbin Jiao
QiXiang Ye
273
30
0
28 Feb 2023
Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors
Computer Vision and Pattern Recognition (CVPR), 2023
Ji Hou
Xiaoliang Dai
Zijian He
Angela Dai
Matthias Nießner
ViT 3DPC
230
22
0
28 Feb 2023
Remote Sensing Scene Classification with Masked Image Modeling (MIM)
Remote Sensing (RS), 2023
Liya Wang
A. Tien
225
5
0
28 Feb 2023
Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Label-Efficient Representations
International Conference on Learning Representations (ICLR), 2023
Ziyu Jiang
Yinpeng Chen
Mengchen Liu
Dongdong Chen
Xiyang Dai
Lu Yuan
Zicheng Liu
Zinan Lin
SSL VLM CLIP
246
19
0
27 Feb 2023
EnfoMax: Domain Entropy and Mutual Information Maximization for Domain Generalized Face Anti-spoofing
Neurocomputing, 2023
Tianyi Zheng
CVBM
221
4
0
17 Feb 2023
Semantic Image Segmentation: Two Decades of Research
Foundations and Trends in Computer Graphics and Vision (FTCGV), 2023
G. Csurka
Riccardo Volpi
Boris Chidlovskii
3DV
272
76
0
13 Feb 2023
Anatomical Invariance Modeling and Semantic Alignment for Self-supervised Learning in 3D Medical Image Analysis
IEEE International Conference on Computer Vision (ICCV), 2023
Yankai Jiang
Ming Sun
Heng Guo
Xiaoyu Bai
K. Yan
Le Lu
Minfeng Xu
MedIm
289
32
0
11 Feb 2023
AIM: Adapting Image Models for Efficient Video Action Recognition
International Conference on Learning Representations (ICLR), 2023
Taojiannan Yang
Yi Zhu
Yusheng Xie
Aston Zhang
Chong Chen
Mu Li
ViT
418
219
0
06 Feb 2023
Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
International Conference on Machine Learning (ICML), 2023
Zekun Qi
Runpei Dong
Guo Fan
Zheng Ge
Xiangyu Zhang
Kaisheng Ma
Li Yi
402
188
0
05 Feb 2023
MOMA:Distill from Self-Supervised Teachers
Xingtai Lv
Nandakishor Desai
M. Palaniswami
254
5
0
04 Feb 2023
Energy-Inspired Self-Supervised Pretraining for Vision Models
International Conference on Learning Representations (ICLR), 2023
Ze Wang
Jiang Wang
Zicheng Liu
Qiang Qiu
247
10
0
02 Feb 2023
Aerial Image Object Detection With Vision Transformer Detector (ViTDet)
IEEE International Geoscience and Remote Sensing Symposium (IGARSS), 2023
Liya Wang
A. Tien
414
19
0
28 Jan 2023
Compact Transformer Tracker with Correlative Masked Modeling
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zikai Song
Run Luo
Junqing Yu
Yi-Ping Phoebe Chen
Wei Yang
ViT
153
99
0
26 Jan 2023
Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture
Computer Vision and Pattern Recognition (CVPR), 2023
Mahmoud Assran
Quentin Duval
Ishan Misra
Piotr Bojanowski
Pascal Vincent
Michael G. Rabbat
Yann LeCun
Nicolas Ballas
SSL AI4TS MDE
466
579
0
19 Jan 2023
Vision Learners Meet Web Image-Text Pairs
Bingchen Zhao
Quan Cui
Hao Wu
Osamu Yoshie
Cheng Yang
Oisin Mac Aodha
VLM
183
6
0
17 Jan 2023
RILS: Masked Visual Reconstruction in Language Semantic Space
Computer Vision and Pattern Recognition (CVPR), 2023
Shusheng Yang
Yixiao Ge
Kun Yi
Dian Li
Ying Shan
Xiaohu Qie
Xinggang Wang
CLIP
193
14
0
17 Jan 2023
A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2023
Jie Gui
Tuo Chen
Jing Zhang
Qiong Cao
Zhe Sun
Haoran Luo
Dacheng Tao
569
354
0
13 Jan 2023
Toward Building General Foundation Models for Language, Vision, and Vision-Language Understanding Tasks
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Xinsong Zhang
Yan Zeng
Jipeng Zhang
Hang Li
VLM AI4CE LRM
302
18
0
12 Jan 2023
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
International Conference on Learning Representations (ICLR), 2023
Keyu Tian
Yi Jiang
Qishuai Diao
Chen Lin
Liwei Wang
Zehuan Yuan
278
135
0
09 Jan 2023
Ego-Only: Egocentric Action Detection without Exocentric Transferring
IEEE International Conference on Computer Vision (ICCV), 2023
Huiyu Wang
Mitesh Singh
Lorenzo Torresani
EgoV
353
35
0
03 Jan 2023
TinyMIM: An Empirical Study of Distilling MIM Pre-trained Models
Computer Vision and Pattern Recognition (CVPR), 2023
Sucheng Ren
Fangyun Wei
Zheng Zhang
Han Hu
321
51
0
03 Jan 2023
Disjoint Masking with Joint Distillation for Efficient Masked Image Modeling
IEEE Transactions on Multimedia (IEEE TMM), 2022
Xin Ma
Yu Xie
Chunyu Xie
Long Ye
Yafeng Deng
Xiang Ji
344
16
0
31 Dec 2022
Transformers in Action Recognition: A Review on Temporal Modeling
Elham Shabaninia
Hossein Nezamabadi-pour
Fatemeh Shafizadegan
ViT
211
14
0
29 Dec 2022
Swin MAE: Masked Autoencoders for Small Datasets
Zián Xu
Yin Dai
Fayu Liu
Weibin Chen
Yue Liu
Li-Li Shi
Sheng Liu
Yuhang Zhou
SyDa MedIm ViT
264
39
0
28 Dec 2022
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
International Conference on Learning Representations (ICLR), 2022
Runpei Dong
Zekun Qi
Linfeng Zhang
Junbo Zhang
Jian‐Yuan Sun
Zheng Ge
Li Yi
Kaisheng Ma
ViT 3DPC
307
137
0
16 Dec 2022
Toward Improved Generalization: Meta Transfer of Self-supervised Knowledge on Graphs
Wenhui Cui
H. Akrami
Anand A. Joshi
Richard M. Leahy
168
1
0
16 Dec 2022
MAViL: Masked Audio-Video Learners
Neural Information Processing Systems (NeurIPS), 2022
Po-Yao (Bernie) Huang
Vasu Sharma
Hu Xu
Chaitanya K. Ryali
Haoqi Fan
Yanghao Li
Shang-Wen Li
Gargi Ghosh
Jitendra Malik
Christoph Feichtenhofer
322
73
0
15 Dec 2022
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language
International Conference on Machine Learning (ICML), 2022
Alexei Baevski
Arun Babu
Wei-Ning Hsu
Michael Auli
VLM SSL
352
123
0
14 Dec 2022
FastMIM: Expediting Masked Image Modeling Pre-training for Vision
Jianyuan Guo
Kai Han
Han Wu
Yehui Tang
Yunhe Wang
Chang Xu
198
15
0
13 Dec 2022
Jointly Learning Visual and Auditory Speech Representations from Raw Data
International Conference on Learning Representations (ICLR), 2022
A. Haliassos
Pingchuan Ma
Rodrigo Mira
Stavros Petridis
Maja Pantic
SSL
306
70
0
12 Dec 2022
CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Shuyang Gu
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
166
50
0
12 Dec 2022