Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2111.12710
Cited By
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers
24 November 2021
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
Baining Guo
ViT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers"
39 / 189 papers shown
Title
Object-wise Masked Autoencoders for Fast Pre-training
Jiantao Wu
Shentong Mo
ViT
OCL
17
15
0
28 May 2022
Architecture-Agnostic Masked Image Modeling -- From ViT back to CNN
Siyuan Li
Di Wu
Fang Wu
Lei Shang
Stan.Z.Li
22
47
0
27 May 2022
Green Hierarchical Vision Transformer for Masked Image Modeling
Lang Huang
Shan You
Mingkai Zheng
Fei Wang
Chao Qian
T. Yamasaki
13
68
0
26 May 2022
MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers
Jihao Liu
Xin Huang
Jinliang Zheng
Yu Liu
Hongsheng Li
19
53
0
26 May 2022
Decoder Denoising Pretraining for Semantic Segmentation
Emmanuel B. Asiedu
Simon Kornblith
Ting Chen
Niki Parmar
Matthias Minderer
Mohammad Norouzi
AI4CE
180
26
0
23 May 2022
Improvements to Self-Supervised Representation Learning for Masked Image Modeling
Jia-ju Mao
Xuesong Yin
Yuan Chang
Honggu Zhou
SSL
9
1
0
21 May 2022
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li
Wenhai Wang
Lingfeng Yang
Jian Yang
95
73
0
20 May 2022
Masked Image Modeling with Denoising Contrast
Kun Yi
Yixiao Ge
Xiaotong Li
Shusheng Yang
Dian Li
Jianping Wu
Ying Shan
Xiaohu Qie
VLM
19
50
0
19 May 2022
VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
Yuchao Gu
Xintao Wang
Liangbin Xie
Chao Dong
Gengyan Li
Ying Shan
Mingg-Ming Cheng
22
112
0
13 May 2022
Reduce Information Loss in Transformers for Pluralistic Image Inpainting
Qiankun Liu
Zhentao Tan
Dongdong Chen
Qi Chu
Xiyang Dai
Yinpeng Chen
Mengchen Liu
Lu Yuan
Nenghai Yu
ViT
20
70
0
10 May 2022
ConvMAE: Masked Convolution Meets Masked Autoencoders
Peng Gao
Teli Ma
Hongsheng Li
Ziyi Lin
Jifeng Dai
Yu Qiao
ViT
19
119
0
08 May 2022
i-Code: An Integrative and Composable Multimodal Learning Framework
Ziyi Yang
Yuwei Fang
Chenguang Zhu
Reid Pryzant
Dongdong Chen
...
Bin Xiao
Yuanxun Lu
Takuya Yoshioka
Michael Zeng
Xuedong Huang
35
45
0
03 May 2022
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-text Retrieval
Yuying Ge
Yixiao Ge
Xihui Liu
Alex Jinpeng Wang
Jianping Wu
Ying Shan
Xiaohu Qie
Ping Luo
VLM
9
43
0
26 Apr 2022
The Devil is in the Frequency: Geminated Gestalt Autoencoder for Self-Supervised Visual Pre-Training
Hao Liu
Xinghua Jiang
Xin Li
Antai Guo
Deqiang Jiang
Bo Ren
16
36
0
18 Apr 2022
DeiT III: Revenge of the ViT
Hugo Touvron
Matthieu Cord
Hervé Jégou
ViT
21
383
0
14 Apr 2022
Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection
Yuxin Fang
Shusheng Yang
Shijie Wang
Yixiao Ge
Ying Shan
Xinggang Wang
6
54
0
06 Apr 2022
mc-BEiT: Multi-choice Discretization for Image BERT Pre-training
Xiaotong Li
Yixiao Ge
Kun Yi
Zixuan Hu
Ying Shan
Ling-yu Duan
11
38
0
29 Mar 2022
Mugs: A Multi-Granular Self-Supervised Learning Framework
Pan Zhou
Yichen Zhou
Chenyang Si
Weihao Yu
Teck Khim Ng
Shuicheng Yan
VLM
23
59
0
27 Mar 2022
Beyond Masking: Demystifying Token-Based Pre-Training for Vision Transformers
Yunjie Tian
Lingxi Xie
Jiemin Fang
Mengnan Shi
Junran Peng
Xiaopeng Zhang
Jianbin Jiao
Qi Tian
QiXiang Ye
12
19
0
27 Mar 2022
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong
Yibing Song
Jue Wang
Limin Wang
ViT
37
1,114
0
23 Mar 2022
Three things everyone should know about Vision Transformers
Hugo Touvron
Matthieu Cord
Alaaeldin El-Nouby
Jakob Verbeek
Hervé Jégou
ViT
10
118
0
18 Mar 2022
MVP: Multimodality-guided Visual Pre-training
Longhui Wei
Lingxi Xie
Wen-gang Zhou
Houqiang Li
Qi Tian
11
104
0
10 Mar 2022
Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement
Mohamed Ali Souibgui
Sanket Biswas
Andrés Mafla
Ali Furkan Biten
Alicia Fornés
Yousri Kessentini
Josep Lladós
Lluís Gómez
Dimosthenis Karatzas
13
23
0
09 Mar 2022
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
Yuxin Fang
Li Dong
Hangbo Bao
Xinggang Wang
Furu Wei
11
86
0
07 Feb 2022
Context Autoencoder for Self-Supervised Representation Learning
Xiaokang Chen
Mingyu Ding
Xiaodi Wang
Ying Xin
Shentong Mo
Yunhao Wang
Shumin Han
Ping Luo
Gang Zeng
Jingdong Wang
SSL
22
384
0
07 Feb 2022
Multimodal Image Synthesis and Editing: The Generative AI Era
Fangneng Zhan
Yingchen Yu
Rongliang Wu
Jiahui Zhang
Shijian Lu
Lingjie Liu
Adam Kortylewski
Christian Theobalt
Eric Xing
EGVM
24
48
0
27 Dec 2021
General Facial Representation Learning in a Visual-Linguistic Manner
Yinglin Zheng
Hao Yang
Ting Zhang
Jianmin Bao
Dongdong Chen
Yangyu Huang
Lu Yuan
Dong Chen
Ming Zeng
Fang Wen
CVBM
132
161
0
06 Dec 2021
BEVT: BERT Pretraining of Video Transformers
Rui Wang
Dongdong Chen
Zuxuan Wu
Yinpeng Chen
Xiyang Dai
Mengchen Liu
Yu-Gang Jiang
Luowei Zhou
Lu Yuan
ViT
23
202
0
02 Dec 2021
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
258
7,337
0
11 Nov 2021
Are we ready for a new paradigm shift? A Survey on Visual Deep MLP
Ruiyang Liu
Yinghui Li
Li Tao
Dun Liang
Haitao Zheng
77
96
0
07 Nov 2021
Mobile-Former: Bridging MobileNet and Transformer
Yinpeng Chen
Xiyang Dai
Dongdong Chen
Mengchen Liu
Xiaoyi Dong
Lu Yuan
Zicheng Liu
ViT
172
462
0
12 Aug 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
38
309
0
24 Jun 2021
Exploring the Diversity and Invariance in Yourself for Visual Pre-Training Task
Longhui Wei
Lingxi Xie
Wen-gang Zhou
Houqiang Li
Qi Tian
SSL
13
3
0
01 Jun 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang
Enze Xie
Xiang Li
Deng-Ping Fan
Kaitao Song
Ding Liang
Tong Lu
Ping Luo
Ling Shao
ViT
263
3,538
0
24 Feb 2021
Zero-Shot Text-to-Image Generation
Aditya A. Ramesh
Mikhail Pavlov
Gabriel Goh
Scott Gray
Chelsea Voss
Alec Radford
Mark Chen
Ilya Sutskever
VLM
253
4,735
0
24 Feb 2021
Improved Baselines with Momentum Contrastive Learning
Xinlei Chen
Haoqi Fan
Ross B. Girshick
Kaiming He
SSL
238
3,359
0
09 Mar 2020
Adversarial examples in the physical world
Alexey Kurakin
Ian Goodfellow
Samy Bengio
SILM
AAML
250
5,813
0
08 Jul 2016
Pixel Recurrent Neural Networks
Aaron van den Oord
Nal Kalchbrenner
Koray Kavukcuoglu
SSeg
GAN
225
2,542
0
25 Jan 2016
Previous
1
2
3
4