BEiT: BERT Pre-Training of Image Transformers


15 June 2021
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
    ViT
arXiv: 2106.08254

Papers citing "BEiT: BERT Pre-Training of Image Transformers"

50 / 1,790 papers shown
Harmonized Spatial and Spectral Learning for Robust and Generalized Medical Image Segmentation
Vandan Gorade
Sparsh Mittal
Debesh Jha
Rekha Singhal
Ulas Bagci
41
3
0
18 Jan 2024
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu
Bencheng Liao
Qian Zhang
Xinlong Wang
Wenyu Liu
Xinggang Wang
Mamba
50
710
0
17 Jan 2024
Scalable Pre-training of Large Autoregressive Image Models
Alaaeldin El-Nouby
Michal Klein
Shuangfei Zhai
Miguel Angel Bautista
Alexander Toshev
Vaishaal Shankar
J. Susskind
Armand Joulin
VLM
33
71
0
16 Jan 2024
MapNeXt: Revisiting Training and Scaling Practices for Online Vectorized HD Map Construction
Toyota Li
26
6
0
14 Jan 2024
MIMIC: Mask Image Pre-training with Mix Contrastive Fine-tuning for Facial Expression Recognition
Fan Zhang
Xiaobao Guo
Xiaojiang Peng
Alex C. Kot
27
0
0
14 Jan 2024
NODI: Out-Of-Distribution Detection with Noise from Diffusion
Jingqiu Zhou
Aojun Zhou
Hongsheng Li
DiffM
29
1
0
13 Jan 2024
Optimization of Discrete Parameters Using the Adaptive Gradient Method and Directed Evolution
Andrei Beinarovich
Sergey Stepanov
Alexander Zaslavsky
46
0
0
12 Jan 2024
Seek for Incantations: Towards Accurate Text-to-Image Diffusion Synthesis through Prompt Engineering
Chang Yu
Junran Peng
Xiangyu Zhu
Zhaoxiang Zhang
Qi Tian
Zhen Lei
DiffM
32
4
0
12 Jan 2024
A Study on Self-Supervised Pretraining for Vision Problems in Gastrointestinal Endoscopy
Edward Sanderson
B. Matuszewski
23
2
0
11 Jan 2024
Transformer-CNN Fused Architecture for Enhanced Skin Lesion Segmentation
Siddharth Tiwari
MedIm
ViT
45
0
0
10 Jan 2024
Revisiting Adversarial Training at Scale
Zeyu Wang
Xianhang Li
Hongru Zhu
Cihang Xie
34
15
0
09 Jan 2024
Low-resource finetuning of foundation models beats state-of-the-art in histopathology
Benedikt Roth
Valentin Koch
S. J. Wagner
Julia A. Schnabel
Carsten Marr
Tingying Peng
MedIm
21
8
0
09 Jan 2024
Skin Cancer Segmentation and Classification Using Vision Transformer for Automatic Analysis in Dermatoscopy-based Non-invasive Digital System
Galib Muhammad Shahriar Himel
Md. Masudul Islam
Kh Abdullah Al-Aff
Shams Ibne Karim
Md. Kabir Uddin Sikder
MedIm
20
23
0
09 Jan 2024
Fully Attentional Networks with Self-emerging Token Labeling
Bingyin Zhao
Zhiding Yu
Shiyi Lan
Yutao Cheng
A. Anandkumar
Yingjie Lao
Jose M. Alvarez
980
6
0
08 Jan 2024
PIXAR: Auto-Regressive Language Modeling in Pixel Space
Yintao Tai
Xiyang Liao
Alessandro Suglia
Antonio Vergari
MLLM
26
7
0
06 Jan 2024
Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively
Haobo Yuan
Xiangtai Li
Chong Zhou
Yining Li
Kai Chen
Chen Change Loy
VLM
29
51
0
05 Jan 2024
SPFormer: Enhancing Vision Transformer with Superpixel Representation
Jieru Mei
Liang-Chieh Chen
Alan L. Yuille
Cihang Xie
ViT
MDE
21
4
0
05 Jan 2024
Towards Weakly Supervised Text-to-Audio Grounding
Xuenan Xu
Ziyang Ma
Mengyue Wu
Kai Yu
AI4TS
33
9
0
05 Jan 2024
SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment
Ziping Ma
Furong Xu
Jian Liu
Ming Yang
Qingpei Guo
VLM
42
3
0
04 Jan 2024
Spikformer V2: Join the High Accuracy Club on ImageNet with an SNN Ticket
Zhaokun Zhou
Kaiwei Che
Wei Fang
Keyu Tian
Yuesheng Zhu
Shuicheng Yan
Yonghong Tian
Liuliang Yuan
ViT
41
28
0
04 Jan 2024
Few-shot Adaptation of Multi-modal Foundation Models: A Survey
Fan Liu
Tianshu Zhang
Wenwen Dai
Wenwen Cai
Xiaocong Zhou
Delong Chen
VLM
OffRL
31
23
0
03 Jan 2024
Skeleton2vec: A Self-supervised Learning Framework with Contextualized Target Representations for Skeleton Sequence
Ruizhuo Xu
Linzhi Huang
Mei Wang
Jiani Hu
Weihong Deng
ViT
MedIm
35
1
0
01 Jan 2024
Masked Modeling for Self-supervised Representation Learning on Vision and Beyond
Siyuan Li
Luyuan Zhang
Zedong Wang
Di Wu
Lirong Wu
...
Jun-Xiong Xia
Cheng Tan
Yang Liu
Baigui Sun
Stan Z. Li
SSL
36
14
0
31 Dec 2023
SVFAP: Self-supervised Video Facial Affect Perceiver
Guoying Zhao
Zheng Lian
Kexin Wang
Yu He
Ming Xu
Haiyang Sun
Bin Liu
Jianhua Tao
56
14
0
31 Dec 2023
Morphing Tokens Draw Strong Masked Image Models
Taekyung Kim
Byeongho Heo
Dongyoon Han
54
3
0
30 Dec 2023
Self-supervised Pretraining for Decision Foundation Model: Formulation, Pipeline and Challenges
Xiaoqian Liu
Jianbin Jiao
Junge Zhang
OffRL
LRM
40
2
0
29 Dec 2023
FerKD: Surgical Label Adaptation for Efficient Distillation
Zhiqiang Shen
23
3
0
29 Dec 2023
An Empirical Study of Scaling Law for OCR
Miao Rang
Zhenni Bi
Chuanjian Liu
Yunhe Wang
Kai Han
38
6
0
29 Dec 2023
Learning Vision from Models Rivals Learning Vision from Data
Yonglong Tian
Lijie Fan
Kaifeng Chen
Dina Katabi
Dilip Krishnan
Phillip Isola
27
45
0
28 Dec 2023
Unsupervised Universal Image Segmentation
Dantong Niu
Xudong Wang
Xinyang Han
Long Lian
Roei Herzig
Trevor Darrell
VLM
32
17
0
28 Dec 2023
BAL: Balancing Diversity and Novelty for Active Learning
Jingyao Li
Pengguang Chen
Shaozuo Yu
Shu-Lin Liu
Jiaya Jia
16
7
0
26 Dec 2023
Modality-Collaborative Transformer with Hybrid Feature Reconstruction for Robust Emotion Recognition
Chengxin Chen
Pengyuan Zhang
28
5
0
26 Dec 2023
emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
Ziyang Ma
Zhisheng Zheng
Jiaxin Ye
Jinchao Li
Zhifu Gao
Shiliang Zhang
Xie Chen
MDE
SLR
SSL
25
86
0
23 Dec 2023
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen
Jiannan Wu
Wenhai Wang
Weijie Su
Guo Chen
...
Bin Li
Ping Luo
Tong Lu
Yu Qiao
Jifeng Dai
VLM
MLLM
176
924
0
21 Dec 2023
A Semantic Space is Worth 256 Language Descriptions: Make Stronger Segmentation Models with Descriptive Properties
Junfei Xiao
Ziqi Zhou
Wenxuan Li
Shiyi Lan
Jieru Mei
Zhiding Yu
Alan L. Yuille
Yuyin Zhou
Cihang Xie
VLM
19
1
0
21 Dec 2023
Bootstrap Masked Visual Modeling via Hard Patches Mining
Haochen Wang
Junsong Fan
Yuxi Wang
Kaiyou Song
Tiancai Wang
Xiangyu Zhang
Zhaoxiang Zhang
42
5
0
21 Dec 2023
Multimodal Federated Learning with Missing Modality via Prototype Mask and Contrast
Guangyin Bao
Qi Zhang
Duoqian Miao
Zixuan Gong
Liang Hu
Ke Liu
Yang Liu
Chongyang Shi
39
8
0
21 Dec 2023
Generative Multimodal Models are In-Context Learners
Quan-Sen Sun
Yufeng Cui
Xiaosong Zhang
Fan Zhang
Qiying Yu
...
Yueze Wang
Yongming Rao
Jingjing Liu
Tiejun Huang
Xinlong Wang
MLLM
LRM
45
246
0
20 Dec 2023
Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation
Jiaming Liu
Ran Xu
Senqiao Yang
Renrui Zhang
Qizhe Zhang
Zehui Chen
Yandong Guo
Shanghang Zhang
TTA
35
10
0
19 Dec 2023
Structural Information Guided Multimodal Pre-training for Vehicle-centric Perception
Tianlin Li
Wentao Wu
Chenglong Li
Zhicheng Zhao
Zhe Chen
Yukai Shi
Jin Tang
46
4
0
15 Dec 2023
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery
Xin Guo
Jiangwei Lao
Bo Dang
Yingying Zhang
Lei Yu
...
Jian Wang
Jingdong Chen
Ming Yang
Yongjun Zhang
Yansheng Li
36
117
0
15 Dec 2023
SeiT++: Masked Token Modeling Improves Storage-efficient Training
Min-Seob Lee
Song Park
Byeongho Heo
Dongyoon Han
Hyunjung Shim
MQ
VLM
26
1
0
15 Dec 2023
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation
Jinguo Zhu
Xiaohan Ding
Yixiao Ge
Yuying Ge
Sijie Zhao
Hengshuang Zhao
Xiaohua Wang
Ying Shan
ViT
VLM
16
32
0
14 Dec 2023
Impact of Ground Truth Quality on Handwriting Recognition
Michael Jungo
Lars Vogtlin
Atefeh Fakhari
Nathan Wegmann
Rolf Ingold
Andreas Fischer
A. Scius-Bertrand
13
0
0
14 Dec 2023
Semi-supervised Semantic Segmentation Meets Masked Modeling: Fine-grained Locality Learning Matters in Consistency Regularization
W. Pan
Zhe Xu
Jiangpeng Yan
Zihan Wu
R. Tong
Xiu Li
Jianhua Yao
ISeg
28
1
0
14 Dec 2023
Black-box Membership Inference Attacks against Fine-tuned Diffusion Models
Yan Pang
Tianhao Wang
27
18
0
13 Dec 2023
PAD: Self-Supervised Pre-Training with Patchwise-Scale Adapter for Infrared Images
Tao Zhang
Kun Ding
Jinyong Wen
Yu Xiong
Zeyu Zhang
Shiming Xiang
Chunhong Pan
27
3
0
13 Dec 2023
Learned representation-guided diffusion models for large-image generation
Alexandros Graikos
Srikar Yellapragada
Minh-Quan Le
S. Kapse
Prateek Prasanna
Joel H. Saltz
Dimitris Samaras
DiffM
37
27
0
12 Dec 2023
Domain Prompt Learning with Quaternion Networks
Qinglong Cao
Zhengqin Xu
Yuntian Chen
Chao Ma
Xiaokang Yang
VLM
39
10
0
12 Dec 2023
Building Universal Foundation Models for Medical Image Analysis with Spatially Adaptive Networks
Lingxiao Luo
Xuanzhong Chen
Bingda Tang
Xinsheng Chen
Rong Han
Chengpeng Hu
Yujiang Li
Ting Chen
MedIm
26
2
0
12 Dec 2023