iBOT: Image BERT Pre-Training with Online Tokenizer

15 November 2021
Jinghao Zhou
Chen Wei
Huiyu Wang
Wei Shen
Cihang Xie
Alan Yuille
Tao Kong

Papers citing "iBOT: Image BERT Pre-Training with Online Tokenizer"

50 / 607 papers shown
Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features
T. Silva
Hélio Pedrini
Adín Ramírez Rivera
SSL
163
6
0
03 Jul 2024
Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning
Chengchao Shen
Jianzhong Chen
Jianxin Wang
SSL
251
3
0
02 Jul 2024
Foundational Models for Pathology and Endoscopy Images: Application for Gastric Inflammation
H. Kerdegari
Kyle Higgins
Dennis Veselkov
I. Laponogov
I. Poļaka
...
Junior Andrea Pescino
M. Leja
M. Dinis-Ribeiro
T. F. Kanonnikoff
Kirill Veselkov
400
6
0
26 Jun 2024
3D-MVP: 3D Multiview Pretraining for Robotic Manipulation
Shengyi Qian
Kaichun Mo
Valts Blukis
David Fouhey
Dieter Fox
Ankit Goyal
184
6
0
26 Jun 2024
Investigating Self-Supervised Methods for Label-Efficient Learning
S. Nandam
Sara Atito
Zhenhua Feng
Josef Kittler
Muhammad Awais
VLM
150
2
0
25 Jun 2024
Pseudo Labelling for Enhanced Masked Autoencoders
S. Nandam
Sara Atito
Zhenhua Feng
Josef Kittler
Muhammad Awais
149
1
0
25 Jun 2024
Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds
Hongliang Zeng
Ping Zhang
Fang Li
Jiahua Wang
Tingyu Ye
Pengteng Guo
3DPC
317
1
0
25 Jun 2024
HEST-1k: A Dataset for Spatial Transcriptomics and Histology Image Analysis
Guillaume Jaume
Paul Doucet
Andrew H. Song
Ming Y. Lu
Cristina Almagro-Pérez
...
Anurag J. Vaidya
Richard J. Chen
Drew F. K. Williamson
Ahrong Kim
Faisal Mahmood
312
79
0
23 Jun 2024
A Simple Framework for Open-Vocabulary Zero-Shot Segmentation
Thomas Stegmüller
Tim Lebailly
Nikola Dukic
Behzad Bozorgtabar
Tinne Tuytelaars
Jean-Philippe Thiran
VLM
399
3
0
23 Jun 2024
Duoduo CLIP: Efficient 3D Understanding with Multi-View Images
Han-Hung Lee
Yiming Zhang
Angel X. Chang
3DPC
495
4
0
17 Jun 2024
ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts
Samar Khanna
Medhanie Irgau
David B. Lobell
Stefano Ermon
VLM
631
14
0
16 Jun 2024
SemanticMIM: Marring Masked Image Modeling with Semantics Compression for General Visual Representation
Yike Yuan
Huanzhang Dou
Fengjun Guo
Xi Li
222
2
0
15 Jun 2024
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
Roman Bachmann
Oğuzhan Fatih Kar
David Mizrahi
Ali Garjani
Mingfei Gao
David Griffiths
Jiaming Hu
Afshin Dehghan
Amir Zamir
MoE VLM MLLM
246
33
0
13 Jun 2024
Image and Video Tokenization with Binary Spherical Quantization
Yue Zhao
Yuanjun Xiong
Philipp Krahenbuhl
228
57
0
11 Jun 2024
Let Go of Your Labels with Unsupervised Transfer
Artyom Gadetsky
Yulun Jiang
Maria Brbić
VLM
223
12
0
11 Jun 2024
SignMusketeers: An Efficient Multi-Stream Approach for Sign Language Translation at Scale
Shester Gueuwou
Xiaodan Du
G. Shakhnarovich
Karen Livescu
SLR
316
9
0
11 Jun 2024
The 3D-PC: a benchmark for visual perspective taking in humans and machines
Drew Linsley
Peisen Zhou
A. Ashok
Akash Nagaraj
Gaurav Gaonkar
Francis E Lewis
Zygmunt Pizlo
Thomas Serre
394
9
0
06 Jun 2024
Enhancing 2D Representation Learning with a 3D Prior
Mehmet Aygun
Prithviraj Dhar
Zhicheng Yan
Oisin Mac Aodha
Rakesh Ranjan
SSL
186
1
0
04 Jun 2024
An Empirical Study into Clustering of Unseen Datasets with Self-Supervised Encoders
Scott C. Lowe
Joakim Bruslund Haurum
Sageev Oore
T. Moeslund
Graham W. Taylor
SSL
219
5
0
04 Jun 2024
Scaling Up Deep Clustering Methods Beyond ImageNet-1K
Tim Kaiser
Félix D. P. Michels
Kaspar Senft
Diana Petrusheva
M. Kollmann
251
2
0
03 Jun 2024
MASA: Motion-aware Masked Autoencoder with Semantic Alignment for Sign Language Recognition
Weichao Zhao
Hezhen Hu
Wen-gang Zhou
Yunyao Mao
Min Wang
Houqiang Li
SLR
186
20
0
31 May 2024
Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining
Yi Wang
C. Albrecht
Xiao Xiang Zhu
267
15
0
30 May 2024
MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning
Junjie Wang
Guangjing Yang
Wentao Chen
Huahui Yi
Xiaohu Wu
Qicheng Lao
MoE ALM
247
0
0
29 May 2024
In-Context Symmetries: Self-Supervised Learning through Contextual World Models
Sharut Gupta
Chenyu Wang
Yifei Wang
Tommi Jaakkola
Stefanie Jegelka
211
5
0
28 May 2024
Visualizing the loss landscape of Self-supervised Vision Transformer
Youngwan Lee
Jeffrey Willette
Jonghee Kim
Sung Ju Hwang
ViT
156
1
0
28 May 2024
DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture
Shentong Mo
Sukmin Yun
197
3
0
28 May 2024
Recasting Generic Pretrained Vision Transformers As Object-Centric Scene Encoders For Manipulation Policies
Jianing Qian
Anastasios Panagopoulos
Dinesh Jayaraman
LM&Ro ViT
202
9
0
24 May 2024
Mixture of Experts Meets Prompt-Based Continual Learning
Neural Information Processing Systems (NeurIPS), 2024
Minh Le
An Nguyen
Huy Nguyen
Trang Nguyen
Trang Pham
L. Ngo
Nhat Ho
CLL
487
34
0
23 May 2024
Dinomaly: The Less Is More Philosophy in Multi-Class Unsupervised Anomaly Detection
Computer Vision and Pattern Recognition (CVPR), 2024
Jia Guo
Shuai Lu
Weihang Zhang
Huiqi Li
Hongen Liao
ViT
537
44
0
23 May 2024
Transparency Distortion Robustness for SOTA Image Segmentation Tasks
Volker Knauthe
Arne Rak
Tristan Wirth
Thomas Pollabauer
Simon Metzler
Arjan Kuijper
Dieter W. Fellner
175
3
0
21 May 2024
CLIP with Quality Captions: A Strong Pretraining for Vision Tasks
Pavan Kumar Anasosalu Vasu
Hadi Pouransari
Fartash Faghri
Oncel Tuzel
VLM CLIP
241
9
0
14 May 2024
Efficient Vision-Language Pre-training by Cluster Masking
Computer Vision and Pattern Recognition (CVPR), 2024
Zihao Wei
Zixuan Pan
Andrew Owens
VLM
283
15
0
14 May 2024
Self-Distillation Improves DNA Sequence Inference
Tong Yu
Lei Cheng
Ruslan Khalitov
Erland Brandser Olsson
Zhirong Yang
SyDa
180
1
0
14 May 2024
PLUTO: Pathology-Universal Transformer
Dinkar Juyal
Harshith Padigela
Chintan Shah
Daniel Shenker
Natalia Harguindeguy
...
E. Walk
J. Abel
Harsha Pokkalla
A. Beck
S. Grullon
MedIm ViT LM&MA
194
19
0
13 May 2024
A Review on Discriminative Self-supervised Learning Methods in Computer Vision
Nikolaos Giakoumoglou
Tania Stathaki
Athanasios Gkelias
SSL
397
1
0
08 May 2024
Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning
Weihao Jiang
Yu Xie
Kun He
ViT
298
1
0
06 May 2024
Self-supervised Pre-training of Text Recognizers
M. Kišš
Michal Hradiš
SSL
177
2
0
01 May 2024
What Foundation Models can Bring for Robot Learning in Manipulation : A Survey
Dingzhe Li
Yixiang Jin
A. Yong
Yong A
Hongze Yu
...
Huaping Liu
Gang Hua
F. Sun
Jianwei Zhang
Bin Fang
AI4CE LM&Ro
863
24
0
28 Apr 2024
Self-supervised visual learning in the low-data regime: a comparative evaluation
Sotirios Konstantakos
Despina Ioanna Chalkiadaki
Ioannis Mademlis
Yuki M. Asano
E. Gavves
Georgios Th. Papadopoulos
272
9
0
26 Apr 2024
Road Surface Friction Estimation for Winter Conditions Utilising General Visual Features
Risto Ojala
Eerik Alamikkotervo
78
4
0
25 Apr 2024
Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains
Eunsu Baek
Keondo Park
Jiyoon Kim
Hyung-Sin Kim
OODD OOD
362
12
0
24 Apr 2024
OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks
Sophia Sirko-Galouchenko
Alexandre Boulch
Spyros Gidaris
Andrei Bursuc
Antonín Vobecký
Patrick Pérez
Renaud Marlet
3DPC
273
17
0
22 Apr 2024
An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Jin Gao
Shubo Lin
Shaoru Wang
Yutong Kou
Zeming Li
Liang Li
Congxuan Zhang
Xiaoqin Zhang
Yizheng Wang
Weiming Hu
265
5
0
18 Apr 2024
EgoPet: Egomotion and Interaction Data from an Animal's Perspective
Amir Bar
Arya Bakhtiar
Danny Tran
Antonio Loquercio
Jathushan Rajasegaran
Yann LeCun
Amir Globerson
Trevor Darrell
EgoV
246
8
0
15 Apr 2024
XoFTR: Cross-modal Feature Matching Transformer
Önder Tuzcuoglu
Aybora Köksal
Bugra Sofu
Sinan Kalkan
A. Aydin Alatan
ViT
154
32
0
15 Apr 2024
AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning
Yuwei Tang
Zhenyi Lin
Qilong Wang
Q. Hu
Qinghua Hu
174
24
0
13 Apr 2024
Probing the 3D Awareness of Visual Foundation Models
Mohamed El Banani
Amit Raj
Kevis-Kokitsi Maninis
Abhishek Kar
Yuanzhen Li
Michael Rubinstein
Deqing Sun
Leonidas Guibas
Justin Johnson
Varun Jampani
309
125
0
12 Apr 2024
Emerging Property of Masked Token for Effective Pre-training
Hyesong Choi
Hunsang Lee
Seyoung Joung
Hyejin Park
Jiyeong Kim
Dongbo Min
165
10
0
12 Apr 2024
Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training
Hyesong Choi
Hyejin Park
Kwang Moo Yi
Sungmin Cha
Dongbo Min
238
10
0
12 Apr 2024
Contrastive-Based Deep Embeddings for Label Noise-Resilient Histopathology Image Classification
Lucas Dedieu
Nicolas Nerrienet
A. Nivaggioli
Clara Simmat
Marceau Clavel
Arnaud Gauthier
Stéphane Sockeel
Rémy Peyret
NoLa
143
2
0
11 Apr 2024