
iBOT: Image BERT Pre-Training with Online Tokenizer

15 November 2021

Cihang Xie

Papers citing "iBOT: Image BERT Pre-Training with Online Tokenizer"

50 / 607 papers shown

Masked Siamese ConvNets

212

15 Jun 2022

A Simple Data Mixing Prior for Improving Self-Supervised Learning

Computer Vision and Pattern Recognition (CVPR), 2022

Cihang Xie

183

15 Jun 2022

Rethinking Generalization in Few-Shot Classification

Neural Information Processing Systems (NeurIPS), 2022

Mehrtash Harandi

358

15 Jun 2022

SERE: Exploring Feature Self-relation for Self-supervised Transformer

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022

Zhong-Yu Li

Shanghua Gao

Ming-Ming Cheng

ViT MDE

253

10 Jun 2022

Extreme Masking for Learning Instance and Distributed Visual Representations

297

09 Jun 2022

Spatial Entropy as an Inductive Bias for Vision Transformers

Machine-mediated learning (ML), 2022

Wei Bi

286

09 Jun 2022

Can CNNs Be More Robust Than Transformers?

International Conference on Learning Representations (ICLR), 2022

Cihang Xie

248

07 Jun 2022

On the duality between contrastive and non-contrastive self-supervised learning

International Conference on Learning Representations (ICLR), 2022

307

112

03 Jun 2022

Siamese Image Modeling for Self-Supervised Vision Representation Learning

Computer Vision and Pattern Recognition (CVPR), 2022

Gao Huang

Yu Qiao

301

107

02 Jun 2022

Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets

Brazilian Conference on Intelligent Systems (BRACIS), 2022

Leandro M. de Lima

R. Krohling

ViT MedIm

148

30 May 2022

Self-Supervised Visual Representation Learning with Semantic Grouping

Neural Information Processing Systems (NeurIPS), 2022

Xin Wen

Xiaojuan Qi

417

30 May 2022

GMML is All you Need

International Conference on Information Photonics (ICIP), 2022

198

30 May 2022

A Closer Look at Self-Supervised Lightweight Vision Transformers

International Conference on Machine Learning (ICML), 2022

286

28 May 2022

Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training

Neural Information Processing Systems (NeurIPS), 2022

Ziyu Guo

Yu Qiao

884

349

28 May 2022

Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN

International Conference on Machine Learning (ICML), 2022

Siyuan Li

228

27 May 2022

AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition

Neural Information Processing Systems (NeurIPS), 2022

Ping Luo

626

936

26 May 2022

Green Hierarchical Vision Transformer for Masked Image Modeling

Neural Information Processing Systems (NeurIPS), 2022

Fei Wang

291

26 May 2022

HIRL: A General Framework for Hierarchical Image Representation Learning

153

26 May 2022

MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers

Computer Vision and Pattern Recognition (CVPR), 2022

303

26 May 2022

Decoder Denoising Pretraining for Semantic Segmentation

487

23 May 2022

A Study on Transformer Configuration and Training Objective

International Conference on Machine Learning (ICML), 2022

Xin Jiang

Yang You

208

21 May 2022

Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality

Xiang Li

Wenhai Wang

Lingfeng Yang

Jian Yang

305

20 May 2022

Masked Image Modeling with Denoising Contrast

International Conference on Learning Representations (ICLR), 2022

Shusheng Yang

Ying Shan

209

19 May 2022

Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection

IEEE International Conference on Computer Vision (ICCV), 2022

228

19 May 2022

Multiplexed Immunofluorescence Brain Image Analysis Using Self-Supervised Dual-Loss Adaptive Masked Autoencoder

206

10 May 2022

ConvMAE: Masked Convolution Meets Masked Autoencoders

Yu Qiao

256

151

08 May 2022

MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval

European Conference on Computer Vision (ECCV), 2022

Ying Shan

Ping Luo

162

26 Apr 2022

A Masked Image Reconstruction Network for Document-level Relation Extraction

Li Zhang

Yidong Cheng

135

21 Apr 2022

The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training

AAAI Conference on Artificial Intelligence (AAAI), 2022

Xin Li

188

18 Apr 2022

Masked Siamese Networks for Label-Efficient Learning

European Conference on Computer Vision (ECCV), 2022

Pascal Vincent

330

380

14 Apr 2022

DeiT III: Revenge of the ViT

European Conference on Computer Vision (ECCV), 2022

287

545

14 Apr 2022

Evaluating Vision Transformer Methods for Deep Reinforcement Learning from Pixels

209

11 Apr 2022

Representation Learning by Detecting Incorrect Location Embeddings

AAAI Conference on Artificial Intelligence (AAAI), 2022

226

10 Apr 2022

Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection

IEEE International Conference on Computer Vision (ICCV), 2022

Yuxin Fang

Shusheng Yang

Shijie Wang

Yixiao Ge

Ying Shan

Xinggang Wang

243

06 Apr 2022

MultiMAE: Multi-modal Multi-task Masked Autoencoders

European Conference on Computer Vision (ECCV), 2022

427

349

04 Apr 2022

Self-distillation Augmented Masked Autoencoders for Histopathological Image Classification

IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2022

Yang Luo

Zhineng Chen

Shengtian Zhou

Xieping Gao

289

31 Mar 2022

In-N-Out Generative Learning for Dense Unsupervised Video Segmentation

ACM Multimedia (ACM MM), 2022

Chang Zhou

Hongxia Yang

Jingren Zhou

Yi Yang

VOS

241

29 Mar 2022

Large-scale Bilingual Language-Image Contrastive Learning

ByungSoo Ko

Geonmo Gu

VLM

277

28 Mar 2022

Mugs: A Multi-Granular Self-Supervised Learning Framework

Weihao Yu

190

27 Mar 2022

Single-Stream Multi-Level Alignment for Vision-Language Pretraining

European Conference on Computer Vision (ECCV), 2022

356

27 Mar 2022

VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training

Neural Information Processing Systems (NeurIPS), 2022

739

1,640

23 Mar 2022

CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation

European Conference on Computer Vision (ECCV), 2022

260

22 Mar 2022

Three things everyone should know about Vision Transformers

European Conference on Computer Vision (ECCV), 2022

247

155

18 Mar 2022

MVP: Multimodality-guided Visual Pre-training

European Conference on Computer Vision (ECCV), 2022

236

128

10 Mar 2022

DiT: Self-supervised Pre-training for Document Image Transformer

ACM Multimedia (ACM MM), 2022

400

211

04 Mar 2022

Recent Advances and Challenges in Deep Audio-Visual Correlation Learning

Luís Vilacca

Yi Yu

Paula Viana

242

28 Feb 2022

data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

International Conference on Machine Learning (ICML), 2022

569

1,037

07 Feb 2022

Corrupted Image Modeling for Self-Supervised Visual Pre-Training

International Conference on Learning Representations (ICLR), 2022

303

07 Feb 2022

Context Autoencoder for Self-Supervised Representation Learning

International Journal of Computer Vision (IJCV), 2022

Mingyu Ding

Shentong Mo

Jingdong Wang

487

454

07 Feb 2022

Adversarial Masking for Self-Supervised Learning

International Conference on Machine Learning (ICML), 2022

448

101

31 Jan 2022