BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao, Li Dong, Songhao Piao, Furu Wei · ViT · 15 June 2021 · arXiv:2106.08254

Papers citing "BEiT: BERT Pre-Training of Image Transformers"

Showing 50 of 1,788 citing papers.

3D-OAE: Occlusion Auto-Encoders for Self-Supervised Learning on Point Clouds
Junsheng Zhou, Xin Wen, Baorui Ma, Yu-Shen Liu, Yue Gao, Yi Fang, Zhizhong Han · 3DPC · 26 Mar 2022

Unsupervised Pre-Training on Patient Population Graphs for Patient-Level Predictions
Chantal Pellegrini, Anees Kazi, Nassir Navab · 23 Mar 2022

Unsupervised Salient Object Detection with Spectral Cluster Voting
Gyungin Shin, Samuel Albanie, Weidi Xie · 23 Mar 2022

VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Zhan Tong, Yibing Song, Jue Wang, Limin Wang · ViT · 23 Mar 2022

Visual Prompt Tuning
Menglin Jia, Luming Tang, Bor-Chun Chen, Claire Cardie, Serge J. Belongie, Bharath Hariharan, Ser-Nam Lim · VLM, VPVLM · 23 Mar 2022

A Broad Study of Pre-training for Domain Generalization and Adaptation
Donghyun Kim, Kaihong Wang, Stan Sclaroff, Kate Saenko · OOD, AI4CE · 22 Mar 2022

Masked Discrimination for Self-Supervised Learning on Point Clouds
Haotian Liu, Mu Cai, Yong Jae Lee · 3DPC · 21 Mar 2022

MixFormer: End-to-End Tracking with Iterative Mixed Attention
Yutao Cui, Jiang Cheng, Limin Wang, Gangshan Wu · VOT · 21 Mar 2022

Transformer-based HTR for Historical Documents
Phillip Benjamin Strobel, Simon Clematide, M. Volk, Tobias Hodel · 21 Mar 2022

Unified Multivariate Gaussian Mixture for Efficient Neural Image Compression
Xiaosu Zhu, Jingkuan Song, Lianli Gao, Fengcai Zheng, Hengtao Shen · 21 Mar 2022

Multi-Domain Multi-Definition Landmark Localization for Small Datasets
D. Ferman, Gaurav Bharaj · CVBM · 19 Mar 2022

Multi-Modal Masked Pre-Training for Monocular Panoramic Depth Completion
Zhiqiang Yan, Xiang Li, Kun Wang, Zhenyu Zhang, Jun Yu Li, Jian Yang · MDE · 18 Mar 2022

Three things everyone should know about Vision Transformers
Hugo Touvron, Matthieu Cord, Alaaeldin El-Nouby, Jakob Verbeek, Hervé Jégou · ViT · 18 Mar 2022

DU-VLG: Unifying Vision-and-Language Generation via Dual Sequence-to-Sequence Pre-training
Luyang Huang, Guocheng Niu, Jiachen Liu, Xinyan Xiao, Hua-Hong Wu · VLM, CoGe · 17 Mar 2022

Object discovery and representation networks
Olivier J. Hénaff, Skanda Koppula, Evan Shelhamer, Daniel Zoran, Andrew Jaegle, Andrew Zisserman, João Carreira, Relja Arandjelović · 16 Mar 2022

Towards Practical Certifiable Patch Defense with Vision Transformer
Zhaoyu Chen, Bo-wen Li, Jianghe Xu, Shuang Wu, Shouhong Ding, Wenqiang Zhang · AAML, ViT · 16 Mar 2022

Pushing the limits of raw waveform speaker recognition
Jee-weon Jung, You Jin Kim, Hee-Soo Heo, Bong-Jin Lee, Youngki Kwon, Joon Son Chung · 16 Mar 2022

P-STMO: Pre-Trained Spatial Temporal Many-to-One Model for 3D Human Pose Estimation
Wenkang Shan, Zhenhua Liu, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Wen Gao · 3DH · 15 Mar 2022

Self-Promoted Supervision for Few-Shot Transformer
Bowen Dong, Pan Zhou, Shuicheng Yan, W. Zuo · ViT · 14 Mar 2022

Rethinking Minimal Sufficient Representation in Contrastive Learning
Haoqing Wang, Xun Guo, Zhiwei Deng, Yan Lu · SSL · 14 Mar 2022

Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Xiaohan Ding, X. Zhang, Yi Zhou, Jungong Han, Guiguang Ding, Jian-jun Sun · VLM · 13 Mar 2022

Masked Autoencoders for Point Cloud Self-supervised Learning
Yatian Pang, Wenxiao Wang, Francis E. H. Tay, W. Liu, Yonghong Tian, Liuliang Yuan · 3DPC, ViT · 13 Mar 2022

Masked Visual Pre-training for Motor Control
Tete Xiao, Ilija Radosavovic, Trevor Darrell, Jitendra Malik · SSL · 11 Mar 2022

Visualizing and Understanding Patch Interactions in Vision Transformer
Jie Ma, Yalong Bai, Bineng Zhong, Wei Zhang, Ting Yao, Tao Mei · ViT · 11 Mar 2022

Self Pre-training with Masked Autoencoders for Medical Image Classification and Segmentation
Lei Zhou, Huidong Liu, Joseph Bae, Junjun He, Dimitris Samaras, Prateek Prasanna · MedIm, ViT · 10 Mar 2022

MVP: Multimodality-guided Visual Pre-training
Longhui Wei, Lingxi Xie, Wen-gang Zhou, Houqiang Li, Qi Tian · 10 Mar 2022

Text-DIAE: A Self-Supervised Degradation Invariant Autoencoders for Text Recognition and Document Enhancement
Mohamed Ali Souibgui, Sanket Biswas, Andrés Mafla, Ali Furkan Biten, Alicia Fornés, Yousri Kessentini, Josep Lladós, Lluís Gómez, Dimosthenis Karatzas · 09 Mar 2022

Uni4Eye: Unified 2D and 3D Self-supervised Pre-training via Masked Image Modeling Transformer for Ophthalmic Image Classification
Zhiyuan Cai, Li Lin, Huaqing He, Xiaoying Tang · ViT, MedIm · 09 Mar 2022

RankSeg: Adaptive Pixel Classification with Image Category Ranking for Segmentation
Hao He, Yuhui Yuan, Xiangyu Yue, Han Hu · VOS, VLM · 08 Mar 2022

Generating 3D Bio-Printable Patches Using Wound Segmentation and Reconstruction to Treat Diabetic Foot Ulcers
H. Chae, Seunghwan Lee, H. Son, Seungjae Han, T. Lim · MedIm · 08 Mar 2022

Continuous Self-Localization on Aerial Images Using Visual and Lidar Sensors
F. Fervers, Sebastian Bullinger, C. Bodensteiner, Michael Arens, Rainer Stiefelhagen · 07 Mar 2022

UVCGAN: UNet Vision Transformer cycle-consistent GAN for unpaired image-to-image translation
D. Torbunov, Yi Huang, Haiwang Yu, Jin-zhi Huang, Shinjae Yoo, Meifeng Lin, B. Viren, Yihui Ren · ViT · 04 Mar 2022

DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Chaoxi Zhang, Furu Wei · ViT, VLM · 04 Mar 2022

ViT-P: Rethinking Data-efficient Vision Transformers from Locality
B. Chen, Ran A. Wang, Di Ming, Xin Feng · ViT · 04 Mar 2022

DeepNet: Scaling Transformers to 1,000 Layers
Hongyu Wang, Shuming Ma, Li Dong, Shaohan Huang, Dongdong Zhang, Furu Wei · MoE, AI4CE · 01 Mar 2022

Unsupervised Vision-and-Language Pre-training via Retrieval-based Multi-Granular Alignment
Mingyang Zhou, Licheng Yu, Amanpreet Singh, Mengjiao MJ Wang, Zhou Yu, Ning Zhang · VLM · 01 Mar 2022

Multi-modal Alignment using Representation Codebook
Jiali Duan, Liqun Chen, Son Tran, Jinyu Yang, Yi Xu, Belinda Zeng, Trishul M. Chilimbi · 28 Feb 2022

HiP: Hierarchical Perceiver
João Carreira, Skanda Koppula, Daniel Zoran, Adrià Recasens, Catalin Ionescu, ..., M. Botvinick, Oriol Vinyals, Karen Simonyan, Andrew Zisserman, Andrew Jaegle · VLM · 22 Feb 2022

ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond
Qiming Zhang, Yufei Xu, Jing Zhang, Dacheng Tao · ViT · 21 Feb 2022

Visual Attention Network
Meng-Hao Guo, Chengrou Lu, Zheng-Ning Liu, Ming-Ming Cheng, Shiyong Hu · ViT, VLM · 20 Feb 2022

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision
Priya Goyal, Quentin Duval, Isaac Seessel, Mathilde Caron, Ishan Misra, Levent Sagun, Armand Joulin, Piotr Bojanowski · VLM, SSL · 16 Feb 2022

Meta Knowledge Distillation
Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu · 16 Feb 2022

MaskGIT: Masked Generative Image Transformer
Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman · ViT · 08 Feb 2022

How to Understand Masked Autoencoders
Shuhao Cao, Peng-Tao Xu, David A. Clifton · 08 Feb 2022

Corrupted Image Modeling for Self-Supervised Visual Pre-Training
Yuxin Fang, Li Dong, Hangbo Bao, Xinggang Wang, Furu Wei · 07 Feb 2022

OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
Peng Wang, An Yang, Rui Men, Junyang Lin, Shuai Bai, Zhikang Li, Jianxin Ma, Chang Zhou, Jingren Zhou, Hongxia Yang · MLLM, ObjD · 07 Feb 2022

Context Autoencoder for Self-Supervised Representation Learning
Xiaokang Chen, Mingyu Ding, Xiaodi Wang, Ying Xin, Shentong Mo, Yunhao Wang, Shumin Han, Ping Luo, Gang Zeng, Jingdong Wang · SSL · 07 Feb 2022

Self-supervised Learning with Random-projection Quantizer for Speech Recognition
Chung-Cheng Chiu, James Qin, Yu Zhang, Jiahui Yu, Yonghui Wu · SSL · 03 Feb 2022

A Note on "Assessing Generalization of SGD via Disagreement"
Andreas Kirsch, Y. Gal · FedML, UQCV · 03 Feb 2022

AtmoDist: Self-supervised Representation Learning for Atmospheric Dynamics
Sebastian Hoffmann, C. Lessig · AI4Cl · 02 Feb 2022