
Masked Feature Prediction for Self-Supervised Visual Pre-Training

16 December 2021

Christoph Feichtenhofer

ViT


Papers citing "Masked Feature Prediction for Self-Supervised Visual Pre-Training"

50 / 498 papers shown

ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting

295

01 Mar 2024

A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection

311

29 Feb 2024

LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning

Shentong Mo

Yansen Wang

Xufang Luo

Dongsheng Li

VLM

187

27 Feb 2024

The Common Stability Mechanism behind most Self-Supervised Learning Approaches

Abhishek Jha

Matthew B. Blaschko

Yuki M. Asano

Tinne Tuytelaars

SSL

135

22 Feb 2024

VideoPrism: A Foundational Visual Encoder for Video Understanding

...

386

20 Feb 2024

Delving into Multi-modal Multi-task Foundation Models for Road Scene Understanding: From Learning Paradigm Perspectives

IEEE Transactions on Intelligent Vehicles (TIV), 2024

...

Yi Yang

409

05 Feb 2024

MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning

Cheng Tan

Stan Z. Li

VLM

245

03 Feb 2024

MV2MAE: Multi-View Video Masked Autoencoders

248

29 Jan 2024

Harmonized Spatial and Spectral Learning for Robust and Generalized Medical Image Segmentation

Debesh Jha

210

18 Jan 2024

Collaboratively Self-supervised Video Representation Learning for Action Recognition

IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2024

378

15 Jan 2024

Motion Guided Token Compression for Efficient Masked Video Modeling

273

10 Jan 2024

Generic Knowledge Boosted Pre-training For Remote Sensing Images

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024

Ziyue Huang

Mingming Zhang

Yuan Gong

Qingjie Liu

Yunhong Wang

VLM

186

09 Jan 2024

Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence

276

01 Jan 2024

Masked Modeling for Self-supervised Representation Learning on Vision and Beyond

Siyuan Li

Luyuan Zhang

Zedong Wang

Di Wu

Lirong Wu

...

Jun Xia

Cheng Tan

Yang Liu

Baigui Sun

Stan Z. Li

SSL

299

31 Dec 2023

Morphing Tokens Draw Strong Masked Image Models

International Conference on Learning Representations (ICLR), 2023

Taekyung Kim

Byeongho Heo

Dongyoon Han

790

30 Dec 2023

Visual Point Cloud Forecasting enables Scalable Autonomous Driving

Computer Vision and Pattern Recognition (CVPR), 2023

Li Chen

381

29 Dec 2023

Video Understanding with Large Language Models: A Survey

...

717

167

29 Dec 2023

Learning Vision from Models Rivals Learning Vision from Data

Computer Vision and Pattern Recognition (CVPR), 2023

279

28 Dec 2023

Bootstrap Masked Visual Modeling via Hard Patches Mining

Xiangyu Zhang

233

21 Dec 2023

Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation

Qizhe Zhang

Shanghang Zhang

261

19 Dec 2023

M-BEV: Masked BEV Perception for Robust Autonomous Driving

Siran Chen

Yue Ma

Yu Qiao

Yali Wang

289

19 Dec 2023

DMT: Comprehensive Distillation with Multiple Self-supervised Teachers

Yuang Liu

Jing Wang

Qiang-feng Zhou

Fan Wang

Jun Wang

Wei Zhang

153

19 Dec 2023

Semantic-Aware Autoregressive Image Modeling for Visual Representation Learning

AAAI Conference on Artificial Intelligence (AAAI), 2023

191

16 Dec 2023

T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning

European Conference on Computer Vision (ECCV), 2023

Weijie Wei

Fatemeh Karimi Nejadasl

Theo Gevers

Martin R. Oswald

3DPC

279

15 Dec 2023

PAD: Self-Supervised Pre-Training with Patchwise-Scale Adapter for Infrared Images

174

13 Dec 2023

LMD: Faster Image Reconstruction with Latent Masking Diffusion

AAAI Conference on Artificial Intelligence (AAAI), 2023

190

13 Dec 2023

4M: Massively Multimodal Masked Modeling

270

107

11 Dec 2023

Cross-BERT for Point Cloud Pretraining

Peng Li

Mingqiang Wei

189

08 Dec 2023

MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness

609

08 Dec 2023

Rejuvenating image-GPT as Strong Visual Representation Learners

International Conference on Machine Learning (ICML), 2023

Cihang Xie

283

04 Dec 2023

SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference

European Conference on Computer Vision (ECCV), 2023

Feng Wang

Jieru Mei

Yaoyao Liu

VLM

373

117

04 Dec 2023

SANeRF-HQ: Segment Anything for NeRF in High Quality

Computer Vision and Pattern Recognition (CVPR), 2023

281

03 Dec 2023

Local Masking Meets Progressive Freezing: Crafting Efficient Vision Transformers for Self-Supervised Learning

International Conference on Machine Vision (ICMV), 2023

Utku Mert Topcuoglu

Erdem Akagündüz

260

02 Dec 2023

Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling

Computer Vision and Pattern Recognition (CVPR), 2023

Shentong Mo

Pedro Morgado

254

02 Dec 2023

Improve Supervised Representation Learning with Masked Image Modeling

Mojtaba Seyedhosseini

SSL ViT

272

01 Dec 2023

EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything

Computer Vision and Pattern Recognition (CVPR), 2023

...

Raghuraman Krishnamoorthi

Vikas Chandra

VLM

371

236

01 Dec 2023

A-JEPA: Joint-Embedding Predictive Architecture Can Listen

Zhengcong Fei

Mingyuan Fan

Junshi Huang

383

27 Nov 2023

Predicting Gradient is Better: Exploring Self-Supervised Learning for SAR ATR with a Joint-Embedding Predictive Architecture

ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2023

504

26 Nov 2023

Understanding Self-Supervised Features for Learning Unsupervised Instance Segmentation

Christian Rupprecht

204

24 Nov 2023

Towards Transferable Multi-modal Perception Representation Learning for Autonomy: NeRF-Supervised Masked AutoEncoder

Xiaohao Xu

345

23 Nov 2023

Pair-wise Layer Attention with Spatial Masking for Video Prediction

211

19 Nov 2023

From Pretext to Purpose: Batch-Adaptive Self-Supervised Learning

238

16 Nov 2023

Window Attention is Bugged: How not to Interpolate Position Embeddings

International Conference on Learning Representations (ICLR), 2023

Daniel Bolya

Chaitanya K. Ryali

Judy Hoffman

Christoph Feichtenhofer

228

09 Nov 2023

Learning Discriminative Features for Crowd Counting

Haoyi Xiong

254

08 Nov 2023

OmniVec: Learning robust representations with cross modal sharing

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

Siddharth Srivastava

Gaurav Sharma

SSL

288

07 Nov 2023

Asymmetric Masked Distillation for Pre-Training Small Foundation Models

Computer Vision and Pattern Recognition (CVPR), 2023

Zhiyu Zhao

Bingkun Huang

Sen Xing

Gangshan Wu

Yu Qiao

Limin Wang

207

06 Nov 2023

ProS: Facial Omni-Representation Learning via Prototype-based Self-Distillation

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

284

03 Nov 2023

Concatenated Masked Autoencoders as Spatial-Temporal Learner

183

02 Nov 2023

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception

Neural Information Processing Systems (NeurIPS), 2023

Zhongwei Qiu

...

Junyu Han

Errui Ding

Lanfen Lin

Leilei Gan

Jingdong Wang

224

31 Oct 2023

Pre-training with Random Orthogonal Projection Image Modeling

International Conference on Learning Representations (ICLR), 2023

323

28 Oct 2023