ResearchTrend.AI
BEiT: BERT Pre-Training of Image Transformers

15 June 2021
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
    ViT

Papers citing "BEiT: BERT Pre-Training of Image Transformers"

50 / 1,788 papers shown
Visual Representation Learning with Stochastic Frame Prediction
Huiwon Jang
Dongyoung Kim
Junsu Kim
Jinwoo Shin
Pieter Abbeel
Younggyo Seo
39
2
0
11 Jun 2024
A Comparative Survey of Vision Transformers for Feature Extraction in Texture Analysis
Leonardo F. S. Scabini
Andre Sacilotti
Kallil M. C. Zielinski
L. C. Ribas
B. De Baets
Odemir M. Bruno
ViT
33
3
0
10 Jun 2024
An Open and Large-Scale Dataset for Multi-Modal Climate Change-aware Crop Yield Predictions
Fudong Lin
Kaleb Guillot
Summer Crawford
Yihe Zhang
Xu Yuan
Nian-Feng Tzeng
AI4Cl
AI4CE
33
5
0
10 Jun 2024
Adapting Pretrained ViTs with Convolution Injector for Visuo-Motor Control
Dongyoon Hwang
ByungKun Lee
Hojoon Lee
Hyunseung Kim
Jaegul Choo
53
0
0
10 Jun 2024
Gentle-CLIP: Exploring Aligned Semantic In Low-Quality Multimodal Data With Soft Alignment
Zijia Song
Z. Zang
Yelin Wang
Guozheng Yang
Jiangbin Zheng
Kaicheng Yu
Wanyu Chen
Stan Z. Li
33
0
0
09 Jun 2024
Particle Multi-Axis Transformer for Jet Tagging
Muhammad Usman
M. Shahid
Maheen Ejaz
Ummay Hani
Nayab Fatima
Abdul Rehman Khan
Asifullah Khan
Nasir Majid Mirza
33
3
0
09 Jun 2024
Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis
Zanlin Ni
Yulin Wang
Renping Zhou
Jiayi Guo
Jinyi Hu
Zhiyuan Liu
Shiji Song
Yuan Yao
Gao Huang
32
14
0
08 Jun 2024
Nomic Embed Vision: Expanding the Latent Space
Zach Nussbaum
Brandon Duderstadt
Andriy Mulyar
VLM
33
5
0
06 Jun 2024
Parameter-Inverted Image Pyramid Networks
Xizhou Zhu
Xue Yang
Zhaokai Wang
Hao Li
Wenhan Dou
Junqi Ge
Lewei Lu
Yu Qiao
Jifeng Dai
47
0
0
06 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
48
6
0
06 Jun 2024
Cooperative learning of Pl@ntNet's Artificial Intelligence algorithm: how does it work and how can we improve it?
Tanguy Lefort
Antoine Affouard
Benjamin Charlier
J. Lombardo
Mathias Chouet
Hervé Goëau
Joseph Salmon
P. Bonnet
Alexis Joly
37
0
0
05 Jun 2024
Enhancing 2D Representation Learning with a 3D Prior
Mehmet Aygun
Prithviraj Dhar
Zhicheng Yan
Oisin Mac Aodha
Rakesh Ranjan
SSL
56
1
0
04 Jun 2024
An Empirical Study into Clustering of Unseen Datasets with Self-Supervised Encoders
Scott C. Lowe
Joakim Bruslund Haurum
Sageev Oore
T. Moeslund
Graham W. Taylor
SSL
46
3
0
04 Jun 2024
Audio Mamba: Selective State Spaces for Self-Supervised Audio Representations
Sarthak Yadav
Z. Tan
Mamba
37
10
0
04 Jun 2024
Semi-supervised Video Semantic Segmentation Using Unreliable Pseudo Labels for PVUW2024
Biao Wu
Diankai Zhang
Sihan Gao
Cheng-yong Zheng
Shaoli Liu
Ning Wang
35
0
0
02 Jun 2024
DS@BioMed at ImageCLEFmedical Caption 2024: Enhanced Attention Mechanisms in Medical Caption Generation through Concept Detection Integration
Nhi Ngoc-Yen Nguyen
Le-Huy Tu
Dieu-Phuong Nguyen
Nhat-Tan Do
Minh Triet Thai
Bao-Thien Nguyen-Tat
MedIm
29
1
0
01 Jun 2024
Ovis: Structural Embedding Alignment for Multimodal Large Language Model
Shiyin Lu
Yang Li
Qing-Guo Chen
Zhao Xu
Weihua Luo
Kaifu Zhang
Han-Jia Ye
VLM
MLLM
53
35
0
31 May 2024
FinGen: A Dataset for Argument Generation in Finance
Chung-Chi Chen
Hiroya Takamura
Ichiro Kobayashi
Yusuke Miyao
31
0
0
31 May 2024
Is Synthetic Data all We Need? Benchmarking the Robustness of Models Trained with Synthetic Images
Krishnakant Singh
Thanush Navaratnam
Jannik Holmer
Simone Schaub-Meyer
Stefan Roth
DiffM
44
18
0
30 May 2024
Distribution Aligned Semantics Adaption for Lifelong Person Re-Identification
Qizao Wang
Xuelin Qian
Bin Li
Xiangyang Xue
24
1
0
30 May 2024
Neural Isometries: Taming Transformations for Equivariant ML
Thomas W. Mitchel
Michael Taylor
Vincent Sitzmann
28
0
0
29 May 2024
ContextBLIP: Doubly Contextual Alignment for Contrastive Image Retrieval from Linguistically Complex Descriptions
Honglin Lin
Siyu Li
Gu Nan
Chaoyue Tang
Xueting Wang
...
Yankai Rong
Zhili Zhou
Yutong Gao
Qimei Cui
Xiaofeng Tao
25
0
0
29 May 2024
Enhancing Vision-Language Model with Unmasked Token Alignment
Jihao Liu
Jinliang Zheng
Boxiao Liu
Yu Liu
Hongsheng Li
CLIP
24
0
0
29 May 2024
MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning
Junjie Wang
Guangjing Yang
Wentao Chen
Huahui Yi
Xiaohu Wu
Qicheng Lao
MoE
ALM
36
0
0
29 May 2024
Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI
Wei-Bang Jiang
Li-Ming Zhao
Bao-Liang Lu
35
67
0
29 May 2024
MEGA: Masked Generative Autoencoder for Human Mesh Recovery
Guénolé Fiche
Simon Leglaive
Xavier Alameda-Pineda
Francesc Moreno-Noguer
3DH
60
1
0
29 May 2024
SCE-MAE: Selective Correspondence Enhancement with Masked Autoencoder for Self-Supervised Landmark Estimation
Kejia Yin
Varshanth R. Rao
R. Jiang
Xudong Liu
P. Aarabi
David B. Lindell
51
0
0
28 May 2024
In-Context Symmetries: Self-Supervised Learning through Contextual World Models
Sharut Gupta
Chenyu Wang
Yifei Wang
Tommi Jaakkola
Stefanie Jegelka
32
1
0
28 May 2024
Visualizing the loss landscape of Self-supervised Vision Transformer
Youngwan Lee
Jeffrey Willette
Jonghee Kim
Sung Ju Hwang
ViT
38
1
0
28 May 2024
DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture
Shentong Mo
Sukmin Yun
42
3
0
28 May 2024
BehaviorGPT: Smart Agent Simulation for Autonomous Driving with Next-Patch Prediction
Zikang Zhou
Haibo Hu
Xinhong Chen
Jianping Wang
Nan Guan
Kui Wu
Yung-Hui Li
Yu-Kai Huang
Chun Jason Xue
AI4CE
41
17
0
27 May 2024
LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling
Yaohua Zha
Naiqi Li
Yanzi Wang
Tao Dai
Hang Guo
Bin Chen
Zhi Wang
Zhihao Ouyang
Shu-Tao Xia
Mamba
42
8
0
27 May 2024
ARVideo: Autoregressive Pretraining for Self-Supervised Video Representation Learning
Sucheng Ren
Hongru Zhu
Chen Wei
Yijiang Li
Alan L. Yuille
Cihang Xie
AI4TS
VGen
SSL
53
1
0
24 May 2024
What Variables Affect Out-Of-Distribution Generalization in Pretrained Models?
Md Yousuf Harun
Kyungbok Lee
Jhair Gallardo
Giri Krishnan
Christopher Kanan
33
3
0
23 May 2024
A Lost Opportunity for Vision-Language Models: A Comparative Study of Online Test-time Adaptation for Vision-Language Models
Mario Döbler
Robert A. Marsden
Tobias Raichle
Bin Yang
VLM
29
5
0
23 May 2024
Mamba-R: Vision Mamba ALSO Needs Registers
Feng Wang
Jiahao Wang
Sucheng Ren
Guoyizhe Wei
Jieru Mei
Wei Shao
Yuyin Zhou
Alan L. Yuille
Cihang Xie
Mamba
36
20
0
23 May 2024
Time-FFM: Towards LM-Empowered Federated Foundation Model for Time Series Forecasting
Qingxiang Liu
Xu Liu
Chenghao Liu
Qingsong Wen
Yuxuan Liang
AI4TS
AI4CE
48
6
0
23 May 2024
Configuring Data Augmentations to Reduce Variance Shift in Positional Embedding of Vision Transformers
Bum Jun Kim
Sang Woo Kim
ViT
41
1
0
23 May 2024
Harmony: A Joint Self-Supervised and Weakly-Supervised Framework for Learning General Purpose Visual Representations
Mohammed Baharoon
Jonathan Klein
D. L. Michels
SSL
VLM
41
0
0
23 May 2024
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
A. Fuller
Daniel G. Kyrollos
Yousef Yassin
James R. Green
46
2
0
22 May 2024
BIMM: Brain Inspired Masked Modeling for Video Representation Learning
Zhifan Wan
Jie M. Zhang
Chang-bo Li
Shiguang Shan
69
0
0
21 May 2024
Du-IN: Discrete units-guided mask modeling for decoding speech from Intracranial Neural signals
Hui Zheng
Haiteng Wang
Wei-Bang Jiang
Zhongtao Chen
Li He
Pei-Yang Lin
Peng-Hu Wei
Guo-Guang Zhao
Yun-Zhe Liu
50
1
0
19 May 2024
DINO as a von Mises-Fisher mixture model
Hariprasath Govindarajan
Per Sidén
Jacob Roll
Fredrik Lindsten
42
11
0
17 May 2024
Chameleon: Mixed-Modal Early-Fusion Foundation Models
Chameleon Team
MLLM
62
255
0
16 May 2024
CLIP with Quality Captions: A Strong Pretraining for Vision Tasks
Pavan Kumar Anasosalu Vasu
Hadi Pouransari
Fartash Faghri
Oncel Tuzel
VLM
CLIP
35
6
0
14 May 2024
Efficient Vision-Language Pre-training by Cluster Masking
Zihao Wei
Zixuan Pan
Andrew Owens
VLM
29
8
0
14 May 2024
EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training
Yulin Wang
Yang Yue
Rui Lu
Yizeng Han
Shiji Song
Gao Huang
VLM
56
12
0
14 May 2024
MambaOut: Do We Really Need Mamba for Vision?
Weihao Yu
Xinchao Wang
Mamba
45
48
0
13 May 2024
Open Challenges and Opportunities in Federated Foundation Models Towards Biomedical Healthcare
Xingyu Li
Lu Peng
Yuping Wang
Weihua Zhang
AI4CE
MedIm
LM&MA
71
5
0
10 May 2024
MaskMatch: Boosting Semi-Supervised Learning Through Mask Autoencoder-Driven Feature Learning
Wenjin Zhang
Keyi Li
Sen Yang
Chenyang Gao
Wanzhao Yang
Sifan Yuan
I. Marsic
33
1
0
10 May 2024