arXiv: 2111.07832

iBOT: Image BERT Pre-Training with Online Tokenizer

15 November 2021

Cihang Xie

Papers citing "iBOT: Image BERT Pre-Training with Online Tokenizer"

50 / 607 papers shown

Learning from Memory: Non-Parametric Memory Augmented Self-Supervised Learning of Visual Features

Adín Ramírez Rivera

178

6

0

03 Jul 2024

Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning

272

3

0

02 Jul 2024

Foundational Models for Pathology and Endoscopy Images: Application for Gastric Inflammation

Dennis Veselkov

...

Junior Andrea Pescino

M. Dinis-Ribeiro

T. F. Kanonnikoff

Kirill Veselkov

421

6

0

26 Jun 2024

3D-MVP: 3D Multiview Pretraining for Robotic Manipulation

Dieter Fox

205

6

0

26 Jun 2024

Investigating Self-Supervised Methods for Label-Efficient Learning

Josef Kittler

214

2

0

25 Jun 2024

Pseudo Labelling for Enhanced Masked Autoencoders

Josef Kittler

188

1

0

25 Jun 2024

Masked Generative Extractor for Synergistic Representation and 3D Generation of Point Clouds

340

1

0

25 Jun 2024

HEST-1k: A Dataset for Spatial Transcriptomics and Histology Image Analysis

Guillaume Jaume

Cristina Almagro-Pérez

...

Anurag J. Vaidya

Richard J. Chen

Drew F. K. Williamson

362

83

0

23 Jun 2024

A Simple Framework for Open-Vocabulary Zero-Shot Segmentation

Thomas Stegmüller

Behzad Bozorgtabar

Tinne Tuytelaars

Jean-Philippe Thiran

425

3

0

23 Jun 2024

Duoduo CLIP: Efficient 3D Understanding with Multi-View Images

561

4

0

17 Jun 2024

ExPLoRA: Parameter-Efficient Extended Pre-Training to Adapt Vision Transformers under Domain Shifts

David B. Lobell

676

14

0

16 Jun 2024

SemanticMIM: Marring Masked Image Modeling with Semantics Compression for General Visual Representation

254

2

0

15 Jun 2024

4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

Oğuzhan Fatih Kar

Mingfei Gao

David Griffiths

271

33

0

13 Jun 2024

Image and Video Tokenization with Binary Spherical Quantization

Philipp Krahenbuhl

263

59

0

11 Jun 2024

Let Go of Your Labels with Unsupervised Transfer

Artyom Gadetsky

241

13

0

11 Jun 2024

SignMusketeers: An Efficient Multi-Stream Approach for Sign Language Translation at Scale

Shester Gueuwou

G. Shakhnarovich

336

9

0

11 Jun 2024

The 3D-PC: a benchmark for visual perspective taking in humans and machines

Francis E Lewis

420

10

0

06 Jun 2024

Enhancing 2D Representation Learning with a 3D Prior

Prithviraj Dhar

Oisin Mac Aodha

218

1

0

04 Jun 2024

An Empirical Study into Clustering of Unseen Datasets with Self-Supervised Encoders

Scott C. Lowe

Joakim Bruslund Haurum

Graham W. Taylor

238

5

0

04 Jun 2024

Scaling Up Deep Clustering Methods Beyond ImageNet-1K

Félix D. P. Michels

Diana Petrusheva

269

2

0

03 Jun 2024

MASA: Motion-aware Masked Autoencoder with Semantic Alignment for Sign Language Recognition

Min Wang

201

22

0

31 May 2024

Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining

Xiao Xiang Zhu

287

17

0

30 May 2024

MLAE: Masked LoRA Experts for Parameter-Efficient Fine-Tuning

267

0

0

29 May 2024

In-Context Symmetries: Self-Supervised Learning through Contextual World Models

Stefanie Jegelka

248

5

0

28 May 2024

Visualizing the loss landscape of Self-supervised Vision Transformer

Jeffrey Willette

211

1

0

28 May 2024

DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture

Shentong Mo

236

4

0

28 May 2024

Recasting Generic Pretrained Vision Transformers As Object-Centric Scene Encoders For Manipulation Policies

Anastasios Panagopoulos

Dinesh Jayaraman

218

9

0

24 May 2024

Mixture of Experts Meets Prompt-Based Continual Learning

Neural Information Processing Systems (NeurIPS), 2024

Huy Nguyen

503

35

0

23 May 2024

Dinomaly: The Less Is More Philosophy in Multi-Class Unsupervised Anomaly Detection

Computer Vision and Pattern Recognition (CVPR), 2024

Huiqi Li

Hongen Liao

568

49

0

23 May 2024

Transparency Distortion Robustness for SOTA Image Segmentation Tasks

Thomas Pollabauer

Dieter W. Fellner

195

3

0

21 May 2024

CLIP with Quality Captions: A Strong Pretraining for Vision Tasks

Pavan Kumar Anasosalu Vasu

Hadi Pouransari

261

9

0

14 May 2024

Efficient Vision-Language Pre-training by Cluster Masking

Computer Vision and Pattern Recognition (CVPR), 2024

312

15

0

14 May 2024

Self-Distillation Improves DNA Sequence Inference

Ruslan Khalitov

Erland Brandser Olsson

189

1

0

14 May 2024

PLUTO: Pathology-Universal Transformer

Harshith Padigela

Natalia Harguindeguy

...

Harsha Pokkalla

MedIm ViT LM&MA

202

20

0

13 May 2024

A Review on Discriminative Self-supervised Learning Methods in Computer Vision

Nikolaos Giakoumoglou

Athanasios Gkelias

437

1

0

08 May 2024

Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning

334

1

0

06 May 2024

Self-supervised Pre-training of Text Recognizers

200

2

0

01 May 2024

What Foundation Models can Bring for Robot Learning in Manipulation : A Survey

...

900

26

0

28 Apr 2024

Self-supervised visual learning in the low-data regime: a comparative evaluation

Sotirios Konstantakos

Despina Ioanna Chalkiadaki

Ioannis Mademlis

Georgios Th. Papadopoulos

278

9

0

26 Apr 2024

Road Surface Friction Estimation for Winter Conditions Utilising General Visual Features

Eerik Alamikkotervo

98

4

0

25 Apr 2024

Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains

393

12

0

24 Apr 2024

OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks

Sophia Sirko-Galouchenko

Alexandre Boulch

Antonín Vobecký

321

17

0

22 Apr 2024

An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training

284

6

0

18 Apr 2024

EgoPet: Egomotion and Interaction Data from an Animal's Perspective

Antonio Loquercio

Jathushan Rajasegaran

278

8

0

15 Apr 2024

XoFTR: Cross-modal Feature Matching Transformer

Önder Tuzcuoglu

A. Aydin Alatan

167

33

0

15 Apr 2024

AMU-Tuning: Effective Logit Bias for CLIP-based Few-shot Learning

203

24

0

13 Apr 2024

Probing the 3D Awareness of Visual Foundation Models

Mohamed El Banani

Kevis-Kokitsi Maninis

Yuanzhen Li

Michael Rubinstein

Leonidas Guibas

325

127

0

12 Apr 2024

Emerging Property of Masked Token for Effective Pre-training

Hyesong Choi

Hyejin Park

Dongbo Min

170

10

0

12 Apr 2024

Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training

Hyesong Choi

Hyejin Park

Dongbo Min

271

10

0

12 Apr 2024

Contrastive-Based Deep Embeddings for Label Noise-Resilient Histopathology Image Classification

Nicolas Nerrienet

Arnaud Gauthier

Stéphane Sockeel

147

2

0

11 Apr 2024
