BEiT: BERT Pre-Training of Image Transformers


15 June 2021
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
    ViT

Papers citing "BEiT: BERT Pre-Training of Image Transformers"

50 / 1,788 papers shown
Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders
Paul Koch
Jörg Krüger
Ankit Chowdhury
O. Heimann
MDE
55
0
0
25 Mar 2025
Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals
Stefan Stojanov
David Wendt
Seungwoo Kim
R. Venkatesh
Kevin T. Feigelis
Jiajun Wu
Daniel L. K. Yamins
SSL
71
0
0
25 Mar 2025
ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning
Chau Pham
Juan C. Caicedo
Bryan A. Plummer
44
0
0
25 Mar 2025
Your ViT is Secretly an Image Segmentation Model
Tommie Kerssies
Niccolò Cavagnero
Alexander Hermans
Narges Norouzi
Giuseppe Averta
Bastian Leibe
Gijs Dubbelman
Daan de Geus
ViT
VLM
59
1
0
24 Mar 2025
Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition
Yifei Zhang
Chang-Shu Liu
Jin Wei
Xiaomeng Yang
Yu Zhou
Can Ma
Xiangyang Ji
60
2
0
24 Mar 2025
VTD-CLIP: Video-to-Text Discretization via Prompting CLIP
Wencheng Zhu
Yuexin Wang
Hongxuan Li
Pengfei Zhu
Q. Hu
CLIP
48
0
0
24 Mar 2025
Beyond Accuracy: What Matters in Designing Well-Behaved Models?
Robin Hesse
Doğukan Bağcı
Bernt Schiele
Simone Schaub-Meyer
Stefan Roth
VLM
57
0
0
21 Mar 2025
Halton Scheduler For Masked Generative Image Transformer
Victor Besnier
Mickael Chen
David Hurych
Eduardo Valle
Matthieu Cord
52
1
0
21 Mar 2025
Audio-Enhanced Vision-Language Modeling with Latent Space Broadening for High Quality Data Expansion
Yu Sun
Yin Li
R.-H. Sun
Chunhui Liu
Fangming Zhou
Ze Jin
Linjie Wang
Xiang Shen
Zhuolin Hao
Hongyu Xiong
VLM
48
0
0
21 Mar 2025
Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection
Gensheng Pei
Tao Chen
Yujia Wang
Xinhao Cai
Xiangbo Shu
Tianfei Zhou
Yazhou Yao
VLM
53
1
0
21 Mar 2025
Structured-Noise Masked Modeling for Video, Audio and Beyond
Aritra Bhowmik
Fida Mohammad Thoker
Carlos Hinojosa
Bernard Ghanem
Cees G. M. Snoek
VGen
59
0
0
20 Mar 2025
Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis
Imanol G. Estepa
Jesús M. Rodríguez-de-Vera
Ignacio Sarasúa
Bhalaji Nagarajan
P. Radeva
54
0
0
19 Mar 2025
Utilization of Neighbor Information for Image Classification with Different Levels of Supervision
Gihan Jayatilaka
Abhinav Shrivastava
M. Gwilliam
64
0
0
18 Mar 2025
Quantum EigenGame for excited state calculation
David Quiroga
Jason Han
Anastasios Kyrillidis
53
1
0
17 Mar 2025
8-Calves Image dataset
Xuyang Fang
S. Hannuna
Neill D. F. Campbell
113
0
0
17 Mar 2025
MAVEN: Multi-modal Attention for Valence-Arousal Emotion Network
Vrushank Ahire
Kunal Shah
Mudasir Nazir Khan
Nikhil Pakhale
L. Sookha
M. A. Ganaie
Abhinav Dhall
65
0
0
16 Mar 2025
Shape Bias and Robustness Evaluation via Cue Decomposition for Image Classification and Segmentation
Edgar Heinert
Thomas Gottwald
Annika Mütze
Matthias Rottmann
62
0
0
16 Mar 2025
Self-Supervised Pretraining for Fine-Grained Plankton Recognition
Joona Kareinen
T. Eerola
K. Kraft
L. Lensu
S. Suikkanen
H. Kalviainen
SSL
148
0
0
14 Mar 2025
Unlocking Open-Set Language Accessibility in Vision Models
Fawaz Sammani
Jonas Fischer
Nikos Deligiannis
VLM
53
0
0
14 Mar 2025
RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing
Fengxiang Wang
H. Wang
Y. Wang
Di Wang
Mingshuo Chen
...
Yangang Sun
Shuo Wang
L. Lan
Wenjing Yang
Jing Zhang
Mamba
77
3
0
13 Mar 2025
Isolated Channel Vision Transformers: From Single-Channel Pretraining to Multi-Channel Finetuning
Wenyi Lian
Joakim Lindblad
Patrick Micke
Natasa Sladoje
62
0
0
12 Mar 2025
CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation
Hariprasath Govindarajan
Maciej K. Wozniak
Marvin Klingner
Camille Maurice
B. R. Kiran
S. Yogamani
53
0
0
12 Mar 2025
Effective and Efficient Masked Image Generation Models
Zebin You
Jingyang Ou
Xiaolu Zhang
Jun Hu
Jun Zhou
Chongxuan Li
DiffM
VLM
54
1
0
10 Mar 2025
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen
Bingchen Zhao
Yilun Chen
Jiangmiao Pang
Xiaojuan Qi
LM&Ro
41
0
0
10 Mar 2025
Alligat0R: Pre-Training Through Co-Visibility Segmentation for Relative Camera Pose Regression
Thibaut Loiseau
Guillaume Bourmaud
Vincent Lepetit
64
0
0
10 Mar 2025
Brain Inspired Adaptive Memory Dual-Net for Few-Shot Image Classification
Kexin Di
Xiuxing Li
Yuyang Han
Ziyu Li
Qing Li
Xia Wu
VLM
53
0
0
10 Mar 2025
CLICv2: Image Complexity Representation via Content Invariance Contrastive Learning
Shipeng Liu
Liang Zhao
Dengfeng Chen
SSL
96
0
0
09 Mar 2025
USP: Unified Self-Supervised Pretraining for Image Generation and Understanding
Xiangxiang Chu
Renda Li
Yong Wang
60
0
0
08 Mar 2025
Find First, Track Next: Decoupling Identification and Propagation in Referring Video Object Segmentation
Suhwan Cho
Seunghoon Lee
Minhyeok Lee
Jungho Lee
Sangyoun Lee
VOS
77
0
0
05 Mar 2025
Task-Agnostic Attacks Against Vision Foundation Models
Brian Pulfer
Yury Belousov
Vitaliy Kinakh
Teddy Furon
S. Voloshynovskiy
AAML
68
0
0
05 Mar 2025
Is Pre-training Applicable to the Decoder for Dense Prediction?
Chao Ning
Wanshui Gan
Weihao Xuan
Naoto Yokoya
48
0
0
05 Mar 2025
DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering
Jingzhou Luo
Y. Liu
Weixing Chen
Zhen Li
Y. Wang
G. Li
Liang Lin
65
2
0
05 Mar 2025
Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training
Paul Janson
Vaibhav Singh
Paria Mehrbod
Adam Ibrahim
Irina Rish
Eugene Belilovsky
Benjamin Thérien
CLL
75
0
0
04 Mar 2025
Wavelet-Driven Masked Image Modeling: A Path to Efficient Visual Representation
Wenzhao Xiang
Chang Liu
Hongyang Yu
Xilin Chen
31
0
0
02 Mar 2025
MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention
Tianyi Wang
Jianan Fan
Dingxin Zhang
Dongnan Liu
Yong-quan Xia
Heng Huang
Weidong Cai
36
0
0
01 Mar 2025
TimesBERT: A BERT-Style Foundation Model for Time Series Understanding
Haoran Zhang
Yong Liu
Yunzhong Qiu
Haixuan Liu
Zhongyi Pei
Jianmin Wang
Mingsheng Long
AI4TS
42
0
0
28 Feb 2025
Multi-Scale Neighborhood Occupancy Masked Autoencoder for Self-Supervised Learning in LiDAR Point Clouds
Mohamed Abdelsamad
Michael Ulrich
Claudius Gläser
Abhinav Valada
3DPC
42
0
0
27 Feb 2025
FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction
Siyu Jiao
Gengwei Zhang
Yinlong Qian
Jiancheng Huang
Yao Zhao
Humphrey Shi
Lin Ma
Y. X. Wei
Zequn Jie
VLM
39
1
0
27 Feb 2025
Escaping The Big Data Paradigm in Self-Supervised Representation Learning
Carlos Vélez García
Miguel Cazorla
Jorge Pomares
49
0
0
25 Feb 2025
Arrhythmia Classification from 12-Lead ECG Signals Using Convolutional and Transformer-Based Deep Learning Models
Andrei Apostol
Maria Nutu
60
0
0
25 Feb 2025
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations
Benedikt Alkin
Lukas Miklautz
Sepp Hochreiter
Johannes Brandstetter
VLM
65
8
0
24 Feb 2025
Graph Perceiver IO: A General Architecture for Graph Structured Data
Seyun Bae
Hoyoon Byun
Changdae Oh
Yoon-Sik Cho
Kyungwoo Song
GNN
92
2
0
24 Feb 2025
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
58
43
0
24 Feb 2025
Tight Clusters Make Specialized Experts
Stefan K. Nielsen
R. Teo
Laziz U. Abdullaev
Tan M. Nguyen
MoE
59
2
0
21 Feb 2025
Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning
Yongqi Dong
Xingmin Lu
Ruohan Li
Wei Song
B. Arem
Haneen Farah
ViT
105
1
0
21 Feb 2025
Intensity-Spatial Dual Masked Autoencoder for Multi-Scale Feature Learning in Chest CT Segmentation
Yuexing Ding
Jun Wang
H. Lyu
86
0
0
17 Feb 2025
Hyperspherical Energy Transformer with Recurrent Depth
Yunzhe Hu
Difan Zou
Dong Xu
41
0
0
17 Feb 2025
Harnessing Vision Models for Time Series Analysis: A Survey
Jingchao Ni
Ziming Zhao
ChengAo Shen
Hanghang Tong
Dongjin Song
Wei Cheng
Dongsheng Luo
Haifeng Chen
AI4TS
77
1
0
13 Feb 2025
Handwritten Text Recognition: A Survey
Carlos Garrido-Munoz
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
101
0
0
12 Feb 2025
From Pixels to Components: Eigenvector Masking for Visual Representation Learning
Alice Bizeul
Thomas M. Sutter
Alain Ryser
Bernhard Schölkopf
Julius von Kügelgen
Julia E. Vogt
86
1
0
10 Feb 2025