ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

BEiT: BERT Pre-Training of Image Transformers
arXiv:2106.08254, 15 June 2021
Hangbo Bao, Li Dong, Songhao Piao, Furu Wei
[ViT]

Papers citing "BEiT: BERT Pre-Training of Image Transformers"

50 / 1,788 papers shown
  1. Adversarial Masking for Self-Supervised Learning (31 Jan 2022). Yuge Shi, N. Siddharth, Philip H. S. Torr, Adam R. Kosiorek. [SSL]
  2. A Frustratingly Simple Approach for End-to-End Image Captioning (30 Jan 2022). Ziyang Luo, Yadong Xi, Rongsheng Zhang, Jing Ma. [VLM, MLLM]
  3. Mask-based Latent Reconstruction for Reinforcement Learning (28 Jan 2022). Tao Yu, Zhizheng Zhang, Cuiling Lan, Yan Lu, Zhibo Chen.
  4. Learning to Compose Diversified Prompts for Image Emotion Classification (26 Jan 2022). Sinuo Deng, Lifang Wu, Ge Shi, Lehao Xing, Meng Jian, Ye Xiang. [CLIP, VLM]
  5. ShapeFormer: Transformer-based Shape Completion via Sparse Representation (25 Jan 2022). Xingguang Yan, Liqiang Lin, Niloy J. Mitra, Dani Lischinski, Daniel Cohen-Or, Hui Huang. [ViT]
  6. Transformers in Medical Imaging: A Survey (24 Jan 2022). Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat, F. Khan, H. Fu. [ViT, LM&MA, MedIm]
  7. Improving Chest X-Ray Report Generation by Leveraging Warm Starting (24 Jan 2022). Aaron Nicolson, Jason Dowling, Bevan Koopman. [ViT, LM&MA, MedIm]
  8. Revisiting Weakly Supervised Pre-Training of Visual Perception Models (20 Jan 2022). Mannat Singh, Laura Gustafson, Aaron B. Adcock, Vinicius de Freitas Reis, B. Gedik, Raj Prateek Kosaraju, D. Mahajan, Ross B. Girshick, Piotr Dollár, L. van der Maaten. [VLM]
  9. RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training (18 Jan 2022). Luyang Wang, Feng Liang, Yangguang Li, Honggang Zhang, Wanli Ouyang, Jing Shao. [ViT]
  10. VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer (17 Jan 2022). Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang. [ViT]
  11. ViT2Hash: Unsupervised Information-Preserving Hashing (14 Jan 2022). Qinkang Gong, Liangdao Wang, Hanjiang Lai, Yan Pan, Jian Yin.
  12. Time Series Generation with Masked Autoencoder (14 Jan 2022). Meng-yue Zha, SiuTim Wong, Mengqi Liu, Tong Zhang, Kani Chen. [SyDa, AI4TS]
  13. Pyramid Fusion Transformer for Semantic Segmentation (11 Jan 2022). Zipeng Qin, Jianbo Liu, Xiaoling Zhang, Maoqing Tian, Aojun Zhou, Shuai Yi, Hongsheng Li. [ViT]
  14. Neuroplastic graph attention networks for nuclei segmentation in histopathology images (10 Jan 2022). Yoav Alon, Huiyu Zhou. [GNN]
  15. A ConvNet for the 2020s (10 Jan 2022). Zhuang Liu, Hanzi Mao, Chaozheng Wu, Christoph Feichtenhofer, Trevor Darrell, Saining Xie. [ViT]
  16. Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images (04 Jan 2022). Ali Hatamizadeh, V. Nath, Yucheng Tang, Dong Yang, H. Roth, Daguang Xu. [ViT, MedIm]
  17. SPViT: Enabling Faster Vision Transformers via Soft Token Pruning (27 Dec 2021). Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Mengshu Sun, ..., Geng Yuan, Bin Ren, Minghai Qin, H. Tang, Yanzhi Wang. [ViT]
  18. Multimodal Image Synthesis and Editing: The Generative AI Era (27 Dec 2021). Fangneng Zhan, Yingchen Yu, Rongliang Wu, Jiahui Zhang, Shijian Lu, Lingjie Liu, Adam Kortylewski, Christian Theobalt, Eric Xing. [EGVM]
  19. SeMask: Semantically Masked Transformers for Semantic Segmentation (23 Dec 2021). Jitesh Jain, Anukriti Singh, Nikita Orlov, Zilong Huang, Jiachen Li, Steven Walton, Humphrey Shi. [ViT]
  20. SLIP: Self-supervision meets Language-Image Pre-training (23 Dec 2021). Norman Mu, Alexander Kirillov, David A. Wagner, Saining Xie. [VLM, CLIP]
  21. Self-Supervised Graph Representation Learning for Neuronal Morphologies (23 Dec 2021). Marissa A. Weis, Laura Hansel, Timo Lüddecke, Alexander S. Ecker. [MedIm]
  22. Are Large-scale Datasets Necessary for Self-Supervised Pre-training? (20 Dec 2021). Alaaeldin El-Nouby, Gautier Izacard, Hugo Touvron, Ivan Laptev, Hervé Jégou, Edouard Grave. [SSL]
  23. On Efficient Transformer-Based Image Pre-training for Low-Level Vision (19 Dec 2021). Wenbo Li, Xin Lu, Shengju Qian, Jiangbo Lu, X. Zhang, Jiaya Jia. [ViT]
  24. Pre-Training Transformers for Domain Adaptation (18 Dec 2021). Burhan Ul Tayyab, Nicholas Chua. [ViT]
  25. Masked Feature Prediction for Self-Supervised Visual Pre-Training (16 Dec 2021). Chen Wei, Haoqi Fan, Saining Xie, Chaoxia Wu, Alan Yuille, Christoph Feichtenhofer. [ViT]
  26. Distilled Dual-Encoder Model for Vision-Language Understanding (16 Dec 2021). Zekun Wang, Wenhui Wang, Haichao Zhu, Ming Liu, Bing Qin, Furu Wei. [VLM, FedML]
  27. FLAVA: A Foundational Language And Vision Alignment Model (08 Dec 2021). Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, Douwe Kiela. [CLIP, VLM]
  28. Dilated convolution with learnable spacings (07 Dec 2021). Ismail Khalfaoui-Hassani, Thomas Pellegrini, T. Masquelier.
  29. General Facial Representation Learning in a Visual-Linguistic Manner (06 Dec 2021). Yinglin Zheng, Hao Yang, Ting Zhang, Jianmin Bao, Dongdong Chen, Yangyu Huang, Lu Yuan, Dong Chen, Ming Zeng, Fang Wen. [CVBM]
  30. GETAM: Gradient-weighted Element-wise Transformer Attention Map for Weakly-supervised Semantic segmentation (06 Dec 2021). Weixuan Sun, Jing Zhang, Zheyuan Liu, Yiran Zhong, Nick Barnes. [ViT]
  31. BEVT: BERT Pretraining of Video Transformers (02 Dec 2021). Rui Wang, Dongdong Chen, Zuxuan Wu, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Yu-Gang Jiang, Luowei Zhou, Lu Yuan. [ViT]
  32. Masked-attention Mask Transformer for Universal Image Segmentation (02 Dec 2021). Bowen Cheng, Ishan Misra, A. Schwing, Alexander Kirillov, Rohit Girdhar. [ISeg]
  33. MC-SSL0.0: Towards Multi-Concept Self-Supervised Learning (30 Nov 2021). Sara Atito, Muhammad Awais, Ammarah Farooq, Zhenhua Feng, J. Kittler.
  34. EdiBERT, a generative model for image editing (30 Nov 2021). Thibaut Issenhuth, Ugo Tanielian, Jérémie Mary, David Picard. [DiffM]
  35. Pyramid Adversarial Training Improves ViT Performance (30 Nov 2021). Charles Herrmann, Kyle Sargent, Lu Jiang, Ramin Zabih, Huiwen Chang, Ce Liu, Dilip Krishnan, Deqing Sun. [ViT]
  36. Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling (29 Nov 2021). Xumin Yu, Lulu Tang, Yongming Rao, Tiejun Huang, Jie Zhou, Jiwen Lu. [3DPC]
  37. Semantic-Aware Generation for Self-Supervised Visual Representation Learning (25 Nov 2021). Yunjie Tian, Lingxi Xie, Xiaopeng Zhang, Jiemin Fang, Haohang Xu, Wei Huang, Jianbin Jiao, Qi Tian, QiXiang Ye. [SSL, GAN]
  38. Layered Controllable Video Generation (24 Nov 2021). Jiahui Huang, Yuhe Jin, K. M. Yi, Leonid Sigal. [VGen]
  39. PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers (24 Nov 2021). Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu, Baining Guo. [ViT]
  40. VIOLET: End-to-End Video-Language Transformers with Masked Visual-token Modeling (24 Nov 2021). Tsu-jui Fu, Linjie Li, Zhe Gan, Kevin Qinghong Lin, W. Wang, Lijuan Wang, Zicheng Liu. [VLM]
  41. ViCE: Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment (24 Nov 2021). Robin Karlsson, Tomoki Hayashi, Keisuke Fujii, Alexander Carballo, Kento Ohtani, K. Takeda. [SSL]
  42. One to Transfer All: A Universal Transfer Framework for Vision Foundation Model with Few Data (24 Nov 2021). Yujie Wang, Junqin Huang, Mengya Gao, Yichao Wu, Zhen-fei Yin, Ding Liang, Junjie Yan.
  43. Sparse Fusion for Multimodal Transformers (23 Nov 2021). Yi Ding, Alex Rich, Mason Wang, Noah Stier, M. Turk, P. Sen, Tobias Höllerer. [ViT]
  44. Benchmarking Detection Transfer Learning with Vision Transformers (22 Nov 2021). Yanghao Li, Saining Xie, Xinlei Chen, Piotr Dollar, Kaiming He, Ross B. Girshick.
  45. Self-supervised Semi-supervised Learning for Data Labeling and Quality Evaluation (22 Nov 2021). Haoping Bai, Mengyao Cao, Ping-Chia Huang, Jiulong Shan. [SSL]
  46. Discrete Representations Strengthen Vision Transformer Robustness (20 Nov 2021). Chengzhi Mao, Lu Jiang, Mostafa Dehghani, Carl Vondrick, Rahul Sukthankar, Irfan Essa. [ViT]
  47. SimMIM: A Simple Framework for Masked Image Modeling (18 Nov 2021). Zhenda Xie, Zheng-Wei Zhang, Yue Cao, Yutong Lin, Jianmin Bao, Zhuliang Yao, Qi Dai, Han Hu.
  48. Swin Transformer V2: Scaling Up Capacity and Resolution (18 Nov 2021). Ze Liu, Han Hu, Yutong Lin, Zhuliang Yao, Zhenda Xie, ..., Yue Cao, Zheng-Wei Zhang, Li Dong, Furu Wei, B. Guo. [ViT]
  49. TransMix: Attend to Mix for Vision Transformers (18 Nov 2021). Jieneng Chen, Shuyang Sun, Ju He, Philip H. S. Torr, Alan Yuille, S. Bai. [ViT]
  50. INTERN: A New Learning Paradigm Towards General Vision (16 Nov 2021). Jing Shao, Siyu Chen, Yangguang Li, Kun Wang, Zhen-fei Yin, ..., F. Yu, Junjie Yan, Dahua Lin, Xiaogang Wang, Yu Qiao.