2111.07832
Cited By
v1
v2
v3 (latest)
iBOT: Image BERT Pre-Training with Online Tokenizer
15 November 2021
Jinghao Zhou
Chen Wei
Huiyu Wang
Wei Shen
Cihang Xie
Alan Yuille
Tao Kong
Papers citing
"iBOT: Image BERT Pre-Training with Online Tokenizer"
50 / 602 papers shown
Title
Extreme Masking for Learning Instance and Distributed Visual Representations
Zhirong Wu
Zihang Lai
Xiao Sun
Stephen Lin
263
24
0
09 Jun 2022
Spatial Entropy as an Inductive Bias for Vision Transformers
Machine-mediated learning (ML), 2022
E. Peruzzo
E. Sangineto
Yahui Liu
Marco De Nadai
Wei Bi
Bruno Lepri
Andrii Zadaianchuk
ViT
MDE
223
5
0
09 Jun 2022
Can CNNs Be More Robust Than Transformers?
International Conference on Learning Representations (ICLR), 2022
Zeyu Wang
Yutong Bai
Yuyin Zhou
Cihang Xie
UQCV
OOD
189
52
0
07 Jun 2022
On the duality between contrastive and non-contrastive self-supervised learning
International Conference on Learning Representations (ICLR), 2022
Q. Garrido
Yubei Chen
Adrien Bardes
Laurent Najman
Yann LeCun
SSL
255
108
0
03 Jun 2022
Siamese Image Modeling for Self-Supervised Vision Representation Learning
Computer Vision and Pattern Recognition (CVPR), 2022
Chenxin Tao
Xizhou Zhu
Weijie Su
Gao Huang
Bin Li
Jie Zhou
Yu Qiao
Xiaogang Wang
Jifeng Dai
SSL
236
107
0
02 Jun 2022
Exploring Advances in Transformers and CNN for Skin Lesion Diagnosis on Small Datasets
Brazilian Conference on Intelligent Systems (BRACIS), 2022
Leandro M. de Lima
R. Krohling
ViT
MedIm
120
13
0
30 May 2022
Self-Supervised Visual Representation Learning with Semantic Grouping
Neural Information Processing Systems (NeurIPS), 2022
Xin Wen
Bingchen Zhao
Anlin Zheng
Xinming Zhang
Xiaojuan Qi
SSL
359
82
0
30 May 2022
GMML is All you Need
International Conference on Image Processing (ICIP), 2022
Sara Atito
Muhammad Awais
J. Kittler
ViT
VLM
167
20
0
30 May 2022
A Closer Look at Self-Supervised Lightweight Vision Transformers
International Conference on Machine Learning (ICML), 2022
Shaoru Wang
Jin Gao
Zeming Li
Jian Sun
Weiming Hu
ViT
249
56
0
28 May 2022
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training
Neural Information Processing Systems (NeurIPS), 2022
Renrui Zhang
Ziyu Guo
Rongyao Fang
Bingyan Zhao
Dong Wang
Yu Qiao
Hongsheng Li
Peng Gao
3DPC
687
338
0
28 May 2022
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
International Conference on Machine Learning (ICML), 2022
Siyuan Li
Di Wu
Fang Wu
Lei Shang
Stan Z. Li
163
58
0
27 May 2022
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition
Neural Information Processing Systems (NeurIPS), 2022
Shoufa Chen
Chongjian Ge
Zhan Tong
Jiangliu Wang
Yibing Song
Jue Wang
Ping Luo
470
909
0
26 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Neural Information Processing Systems (NeurIPS), 2022
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
250
82
0
26 May 2022
HIRL: A General Framework for Hierarchical Image Representation Learning
Minghao Xu
Yuanfan Guo
Xuanyu Zhu
Jiawen Li
Zhenbang Sun
Jiangtao Tang
Yi Xu
Bingbing Ni
SSL
104
3
0
26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Hongsheng Li
188
76
0
26 May 2022
Decoder Denoising Pretraining for Semantic Segmentation
Emmanuel B. Asiedu
Simon Kornblith
Ting Chen
Niki Parmar
Matthias Minderer
Mohammad Norouzi
AI4CE
464
28
0
23 May 2022
A Study on Transformer Configuration and Training Objective
International Conference on Machine Learning (ICML), 2022
Fuzhao Xue
Jianghai Chen
Aixin Sun
Xiaozhe Ren
Zangwei Zheng
Xiaoxin He
Yongming Chen
Xin Jiang
Yang You
162
10
0
21 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
284
85
0
20 May 2022
Masked Image Modeling with Denoising Contrast
International Conference on Learning Representations (ICLR), 2022
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
163
65
0
19 May 2022
Integrally Migrating Pre-trained Transformer Encoder-decoders for Visual Object Detection
IEEE International Conference on Computer Vision (ICCV), 2022
Yifan Zhang
Xiaosong Zhang
Zhiliang Peng
Zonghao Guo
Fang Wan
Xian-Wei Ji
QiXiang Ye
ObjD
196
26
0
19 May 2022
Multiplexed Immunofluorescence Brain Image Analysis Using Self-Supervised Dual-Loss Adaptive Masked Autoencoder
S. Ly
Bai Lin
Hung Q. Vo
D. Maric
B. Roysam
H. V. Nguyen
136
0
0
10 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao
Teli Ma
Hongsheng Li
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
216
150
0
08 May 2022
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval
European Conference on Computer Vision (ECCV), 2022
Yuying Ge
Yixiao Ge
Xihui Liu
Alex Jinpeng Wang
Jianping Wu
Ying Shan
Xiaohu Qie
Ping Luo
VLM
133
47
0
26 Apr 2022
A Masked Image Reconstruction Network for Document-level Relation Extraction
Li Zhang
Yidong Cheng
109
2
0
21 Apr 2022
The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training
AAAI Conference on Artificial Intelligence (AAAI), 2022
Hao Liu
Xinghua Jiang
Xin Li
Antai Guo
Deqiang Jiang
Bo Ren
172
43
0
18 Apr 2022
Masked Siamese Networks for Label-Efficient Learning
European Conference on Computer Vision (ECCV), 2022
Mahmoud Assran
Mathilde Caron
Ishan Misra
Piotr Bojanowski
Florian Bordes
Pascal Vincent
Armand Joulin
Michael G. Rabbat
Nicolas Ballas
SSL
274
373
0
14 Apr 2022
DeiT III: Revenge of the ViT
European Conference on Computer Vision (ECCV), 2022
Hugo Touvron
Matthieu Cord
Hervé Jégou
ViT
218
520
0
14 Apr 2022
Evaluating Vision Transformer Methods for Deep Reinforcement Learning from Pixels
Tianxin Tao
Daniele Reda
M. van de Panne
ViT
196
20
0
11 Apr 2022
Representation Learning by Detecting Incorrect Location Embeddings
AAAI Conference on Artificial Intelligence (AAAI), 2022
Sepehr Sameni
Simon Jenni
Paolo Favaro
ViT
180
6
0
10 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
IEEE International Conference on Computer Vision (ICCV), 2022
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
183
66
0
06 Apr 2022
MultiMAE: Multi-modal Multi-task Masked Autoencoders
European Conference on Computer Vision (ECCV), 2022
Roman Bachmann
David Mizrahi
Andrei Atanov
Amir Zamir
339
339
0
04 Apr 2022
Self-distillation Augmented Masked Autoencoders for Histopathological Image Classification
IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2022
Yang Luo
Zhineng Chen
Shengtian Zhou
Xieping Gao
190
3
0
31 Mar 2022
In-N-Out Generative Learning for Dense Unsupervised Video Segmentation
ACM Multimedia (ACM MM), 2022
Xiaomiao Pan
Peike Li
Zongxin Yang
Huiling Zhou
Chang Zhou
Hongxia Yang
Jingren Zhou
Yi Yang
VOS
199
12
0
29 Mar 2022
Large-scale Bilingual Language-Image Contrastive Learning
ByungSoo Ko
Geonmo Gu
VLM
210
16
0
28 Mar 2022
Mugs: A Multi-Granular Self-Supervised Learning Framework
Pan Zhou
Yichen Zhou
Chenyang Si
Weihao Yu
Teck Khim Ng
Shuicheng Yan
VLM
140
67
0
27 Mar 2022
Single-Stream Multi-Level Alignment for Vision-Language Pretraining
European Conference on Computer Vision (ECCV), 2022
Zaid Khan
B. Vijaykumar
Xiang Yu
S. Schulter
Manmohan Chandraker
Y. Fu
CLIP
VLM
252
21
0
27 Mar 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Neural Information Processing Systems (NeurIPS), 2022
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
604
1,578
0
23 Mar 2022
CP2: Copy-Paste Contrastive Pretraining for Semantic Segmentation
European Conference on Computer Vision (ECCV), 2022
Feng Wang
Huiyu Wang
Chen Wei
Alan Yuille
Wei Shen
SSL
VLM
220
36
0
22 Mar 2022
Three things everyone should know about Vision Transformers
European Conference on Computer Vision (ECCV), 2022
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Jakob Verbeek
Hervé Jégou
ViT
215
149
0
18 Mar 2022
MVP: Multimodality-guided Visual Pre-training
European Conference on Computer Vision (ECCV), 2022
Longhui Wei
Lingxi Xie
Wen-gang Zhou
Houqiang Li
Qi Tian
175
128
0
10 Mar 2022
DiT: Self-supervised Pre-training for Document Image Transformer
ACM Multimedia (ACM MM), 2022
Junlong Li
Yiheng Xu
Tengchao Lv
Lei Cui
Chaoxi Zhang
Furu Wei
ViT
VLM
329
201
0
04 Mar 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
International Conference on Machine Learning (ICML), 2022
Alexei Baevski
Wei-Ning Hsu
Qiantong Xu
Arun Babu
Jiatao Gu
Michael Auli
SSL
VLM
ViT
408
1,013
0
07 Feb 2022
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
International Conference on Learning Representations (ICLR), 2022
Yuxin Fang
Li Dong
Hangbo Bao
Xinggang Wang
Furu Wei
267
92
0
07 Feb 2022
Context Autoencoder for Self-Supervised Representation Learning
International Journal of Computer Vision (IJCV), 2022
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
382
443
0
07 Feb 2022
Adversarial Masking for Self-Supervised Learning
International Conference on Machine Learning (ICML), 2022
Yuge Shi
N. Siddharth
Juil Sock
Adam R. Kosiorek
SSL
359
99
0
31 Jan 2022
A Frustratingly Simple Approach for End-to-End Image Captioning
Ziyang Luo
Yadong Xi
Rongsheng Zhang
Jing Ma
VLM
MLLM
197
19
0
30 Jan 2022
RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training
International Joint Conference on Artificial Intelligence (IJCAI), 2022
Luyang Wang
Feng Liang
Yangguang Li
Honggang Zhang
Wanli Ouyang
Jing Shao
ViT
168
27
0
18 Jan 2022
MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning
Sara Atito
Muhammad Awais
Ammarah Farooq
Zhenhua Feng
J. Kittler
109
17
0
30 Nov 2021
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
Xumin Yu
Lulu Tang
Yongming Rao
Tiejun Huang
Jie Zhou
Jiwen Lu
3DPC
416
894
0
29 Nov 2021
ViCE: Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment
Robin Karlsson
Tomoki Hayashi
Keisuke Fujii
Alexander Carballo
Kento Ohtani
K. Takeda
SSL
242
5
0
24 Nov 2021