2112.09133
Masked Feature Prediction for Self-Supervised Visual Pre-Training
16 December 2021
Chen Wei
Haoqi Fan
Saining Xie
Chao-Yuan Wu
Alan Yuille
Christoph Feichtenhofer
ViT
Papers citing "Masked Feature Prediction for Self-Supervised Visual Pre-Training"
50 / 498 papers shown
ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting
Chen Duan
Pei Fu
Shan Guo
Qianyi Jiang
Xiaoming Wei
VLM
295
14
0
01 Mar 2024
A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection
Chao Hao
Zitong Yu
Xin Liu
Jun Xu
Huanjing Yue
Jingyu Yang
ViT
311
23
0
29 Feb 2024
LSPT: Long-term Spatial Prompt Tuning for Visual Representation Learning
Shentong Mo
Yansen Wang
Xufang Luo
Dongsheng Li
VLM
187
3
0
27 Feb 2024
The Common Stability Mechanism behind most Self-Supervised Learning Approaches
Abhishek Jha
Matthew B. Blaschko
Yuki M. Asano
Tinne Tuytelaars
SSL
135
4
0
22 Feb 2024
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao
N. B. Gundavarapu
Liangzhe Yuan
Hao Zhou
Shen Yan
...
Huisheng Wang
Hartwig Adam
Mikhail Sirotenko
Ting Liu
Boqing Gong
VGen
386
68
0
20 Feb 2024
Delving into Multi-modal Multi-task Foundation Models for Road Scene Understanding: From Learning Paradigm Perspectives
IEEE Transactions on Intelligent Vehicles (TIV), 2024
Sheng Luo
Wei Chen
Wanxin Tian
Rui Liu
Luanxuan Hou
...
Ling Shao
Yi Yang
Bojun Gao
Qun Li
Guobin Wu
409
28
0
05 Feb 2024
MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning
Zhe Li
Laurence T. Yang
Bocheng Ren
Xin Nie
Zhangyang Gao
Cheng Tan
Stan Z. Li
VLM
245
30
0
03 Feb 2024
MV2MAE: Multi-View Video Masked Autoencoders
Ketul Shah
Robert Crandall
Jie Xu
Peng Zhou
Marian George
Mayank Bansal
Rama Chellappa
248
6
0
29 Jan 2024
Harmonized Spatial and Spectral Learning for Robust and Generalized Medical Image Segmentation
Vandan Gorade
Sparsh Mittal
Debesh Jha
Rekha Singhal
Ulas Bagci
210
3
0
18 Jan 2024
Collaboratively Self-supervised Video Representation Learning for Action Recognition
IEEE Transactions on Information Forensics and Security (IEEE TIFS), 2024
Jie Zhang
Zhifan Wan
Lanqing Hu
Stephen Lin
Shuzhe Wu
Shiguang Shan
TTA
378
2
0
15 Jan 2024
Motion Guided Token Compression for Efficient Masked Video Modeling
Yukun Feng
Yangming Shi
Fengze Liu
Tan Yan
273
0
0
10 Jan 2024
Generic Knowledge Boosted Pre-training For Remote Sensing Images
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024
Ziyue Huang
Mingming Zhang
Yuan Gong
Qingjie Liu
Yunhong Wang
VLM
186
21
0
09 Jan 2024
Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence
Ruizhuo Xu
Linzhi Huang
Mei Wang
Jiani Hu
Weihong Deng
ViT
MedIm
276
5
0
01 Jan 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
299
28
0
31 Dec 2023
Morphing Tokens Draw Strong Masked Image Models
International Conference on Learning Representations (ICLR), 2023
Taekyung Kim
Byeongho Heo
Dongyoon Han
790
3
0
30 Dec 2023
Visual Point Cloud Forecasting enables Scalable Autonomous Driving
Computer Vision and Pattern Recognition (CVPR), 2023
Zetong Yang
Li Chen
Yanan Sun
Guoying Gu
3DPC
381
92
0
29 Dec 2023
Video Understanding with Large Language Models: A Survey
Yunlong Tang
Jing Bi
Siting Xu
Luchuan Song
Susan Liang
...
Feng Zheng
Jianguo Zhang
Chenliang Xu
Jiebo Luo
Chenliang Xu
VLM
717
167
0
29 Dec 2023
Learning Vision from Models Rivals Learning Vision from Data
Computer Vision and Pattern Recognition (CVPR), 2023
Yonglong Tian
Lijie Fan
Kaifeng Chen
Dina Katabi
Dilip Krishnan
Phillip Isola
279
73
0
28 Dec 2023
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
233
6
0
21 Dec 2023
Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation
Jiaming Liu
Ran Xu
Senqiao Yang
Renrui Zhang
Qizhe Zhang
Zehui Chen
Yandong Guo
Shanghang Zhang
TTA
261
26
0
19 Dec 2023
M-BEV: Masked BEV Perception for Robust Autonomous Driving
Siran Chen
Yue Ma
Yu Qiao
Yali Wang
289
18
0
19 Dec 2023
DMT: Comprehensive Distillation with Multiple Self-supervised Teachers
Yuang Liu
Jing Wang
Qiang-feng Zhou
Fan Wang
Jun Wang
Wei Zhang
153
1
0
19 Dec 2023
Semantic-Aware Autoregressive Image Modeling for Visual Representation Learning
AAAI Conference on Artificial Intelligence (AAAI), 2023
Kaiyou Song
Shan Zhang
Tong Wang
VLM
191
2
0
16 Dec 2023
T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning
European Conference on Computer Vision (ECCV), 2023
Weijie Wei
Fatemeh Karimi Nejadasl
Theo Gevers
Martin R. Oswald
3DPC
279
10
0
15 Dec 2023
PAD: Self-Supervised Pre-Training with Patchwise-Scale Adapter for Infrared Images
Tao Zhang
Kun Ding
Jinyong Wen
Yu Xiong
Zeyu Zhang
Shiming Xiang
Chunhong Pan
174
4
0
13 Dec 2023
LMD: Faster Image Reconstruction with Latent Masking Diffusion
AAAI Conference on Artificial Intelligence (AAAI), 2023
Zhiyuan Ma
Zhihuan Yu
Jianjun Li
Bowen Zhou
DiffM
190
13
0
13 Dec 2023
4M: Massively Multimodal Masked Modeling
David Mizrahi
Roman Bachmann
Oğuzhan Fatih Kar
Teresa Yeo
Mingfei Gao
Afshin Dehghan
Amir Zamir
MLLM
270
107
0
11 Dec 2023
Cross-BERT for Point Cloud Pretraining
Xin Li
Peng Li
Zeyong Wei
Zhe Zhu
Mingqiang Wei
Junhui Hou
Liangliang Nan
J. Qin
H. Xie
F. Wang
SSL
3DPC
189
2
0
08 Dec 2023
MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness
Xiaoyun Xu
Shujian Yu
Jingzheng Wu
S. Picek
AAML
609
8
0
08 Dec 2023
Rejuvenating image-GPT as Strong Visual Representation Learners
International Conference on Machine Learning (ICML), 2023
Sucheng Ren
Zeyu Wang
Hongru Zhu
Junfei Xiao
Yaoyao Liu
Cihang Xie
VLM
283
12
0
04 Dec 2023
SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference
European Conference on Computer Vision (ECCV), 2023
Feng Wang
Jieru Mei
Yaoyao Liu
VLM
373
117
0
04 Dec 2023
SANeRF-HQ: Segment Anything for NeRF in High Quality
Computer Vision and Pattern Recognition (CVPR), 2023
Yichen Liu
Benran Hu
Chi-Keung Tang
Yu-Wing Tai
281
23
0
03 Dec 2023
Local Masking Meets Progressive Freezing: Crafting Efficient Vision Transformers for Self-Supervised Learning
International Conference on Machine Vision (ICMV), 2023
Utku Mert Topcuoglu
Erdem Akagündüz
260
2
0
02 Dec 2023
Unveiling the Power of Audio-Visual Early Fusion Transformers with Dense Interactions through Masked Modeling
Computer Vision and Pattern Recognition (CVPR), 2023
Shentong Mo
Pedro Morgado
254
31
0
02 Dec 2023
Improve Supervised Representation Learning with Masked Image Modeling
Kaifeng Chen
Daniel M. Salz
Huiwen Chang
Kihyuk Sohn
Dilip Krishnan
Mojtaba Seyedhosseini
SSL
ViT
272
3
0
01 Dec 2023
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
Computer Vision and Pattern Recognition (CVPR), 2023
Yunyang Xiong
Bala Varadarajan
Lemeng Wu
Xiaoyu Xiang
Fanyi Xiao
...
Dilin Wang
Fei Sun
Forrest N. Iandola
Raghuraman Krishnamoorthi
Vikas Chandra
VLM
371
236
0
01 Dec 2023
A-JEPA: Joint-Embedding Predictive Architecture Can Listen
Zhengcong Fei
Mingyuan Fan
Junshi Huang
383
33
0
27 Nov 2023
Predicting Gradient is Better: Exploring Self-Supervised Learning for SAR ATR with a Joint-Embedding Predictive Architecture
ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2023
Wei-Jang Li
Yang Wei
Tianpeng Liu
Yuenan Hou
Yuxuan Li
Zhen Liu
Yongxiang Liu
Tianpeng Liu
504
57
0
26 Nov 2023
Understanding Self-Supervised Features for Learning Unsupervised Instance Segmentation
Paul Engstler
Luke Melas-Kyriazi
Christian Rupprecht
Iro Laina
SSL
204
7
0
24 Nov 2023
Towards Transferable Multi-modal Perception Representation Learning for Autonomy: NeRF-Supervised Masked AutoEncoder
Xiaohao Xu
345
0
0
23 Nov 2023
Pair-wise Layer Attention with Spatial Masking for Video Prediction
Ping Li
Chenhan Zhang
Zheng Yang
Xianghua Xu
Mingli Song
211
0
0
19 Nov 2023
From Pretext to Purpose: Batch-Adaptive Self-Supervised Learning
Jiansong Zhang
Linlin Shen
Peizhong Liu
SSL
238
0
0
16 Nov 2023
Window Attention is Bugged: How not to Interpolate Position Embeddings
International Conference on Learning Representations (ICLR), 2023
Daniel Bolya
Chaitanya K. Ryali
Judy Hoffman
Christoph Feichtenhofer
228
18
0
09 Nov 2023
Learning Discriminative Features for Crowd Counting
Yuehai Chen
Qingzhong Wang
Jing Yang
Badong Chen
Haoyi Xiong
Shaoyi Du
254
14
0
08 Nov 2023
OmniVec: Learning robust representations with cross modal sharing
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Siddharth Srivastava
Gaurav Sharma
SSL
288
83
0
07 Nov 2023
Asymmetric Masked Distillation for Pre-Training Small Foundation Models
Computer Vision and Pattern Recognition (CVPR), 2023
Zhiyu Zhao
Bingkun Huang
Sen Xing
Gangshan Wu
Yu Qiao
Limin Wang
207
12
0
06 Nov 2023
ProS: Facial Omni-Representation Learning via Prototype-based Self-Distillation
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Xing Di
Yiyu Zheng
Xiaoming Liu
Yu Cheng
284
6
0
03 Nov 2023
Concatenated Masked Autoencoders as Spatial-Temporal Learner
Zhouqiang Jiang
Bowen Wang
Tong Xiang
Zhaofeng Niu
Hong Tang
Guangshun Li
Liangzhi Li
183
4
0
02 Nov 2023
HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception
Neural Information Processing Systems (NeurIPS), 2023
Junkun Yuan
Xinyu Zhang
Hao Zhou
Jian Wang
Zhongwei Qiu
...
Junyu Han
Errui Ding
Lanfen Lin
Leilei Gan
Jingdong Wang
224
28
0
31 Oct 2023
Pre-training with Random Orthogonal Projection Image Modeling
International Conference on Learning Representations (ICLR), 2023
Maryam Haghighat
Peyman Moghadam
Shaheer Mohamed
Piotr Koniusz
VLM
323
14
0
28 Oct 2023