
iBOT: Image BERT Pre-Training with Online Tokenizer

15 November 2021

Cihang Xie

Papers citing "iBOT: Image BERT Pre-Training with Online Tokenizer"

50 / 607 papers shown

Training state-of-the-art pathology foundation models with orders of magnitude less data

International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2025

161

07 Apr 2025

REJEPA: A Novel Joint-Embedding Predictive Architecture for Efficient Remote Sensing Image Retrieval

301

04 Apr 2025

Pairwise Optimal Transports for Training All-to-All Flow-Based Condition Transfer Model

1.2K

04 Apr 2025

Refining CLIP's Spatial Awareness: A Visual-Centric Perspective

International Conference on Learning Representations (ICLR), 2025

307

03 Apr 2025

Scaling Language-Free Visual Representation Learning

...

448

01 Apr 2025

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Computer Vision and Pattern Recognition (CVPR), 2025

...

299

01 Apr 2025

ViT-Linearizer: Distilling Quadratic Knowledge into Linear-Time Vision Models

Guoyizhe Wei

Rama Chellappa

301

30 Mar 2025

Masked Self-Supervised Pre-Training for Text Recognition Transformers on Large-Scale Datasets

Martin Kiss

Michal Hradiš

216

28 Mar 2025

Vanishing Depth: A Depth Adapter with Positional Depth Encoding for Generalized Image Encoders

284

25 Mar 2025

ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning

Chau Pham

Juan C. Caicedo

Bryan A. Plummer

318

25 Mar 2025

Self-Supervised Learning based on Transformed Image Reconstruction for Equivariance-Coherent Feature Representation

251

24 Mar 2025

Structured-Noise Masked Modeling for Video, Audio and Beyond

326

20 Mar 2025

Cube: A Roblox View of 3D Intelligence

Foundation AI Team Roblox

...

287

19 Mar 2025

Object-Centric Pretraining via Target Encoder Bootstrapping

International Conference on Learning Representations (ICLR), 2025

284

19 Mar 2025

Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis

Imanol G. Estepa

Jesús M. Rodríguez-de-Vera

Ignacio Sarasúa

Bhalaji Nagarajan

Petia Radeva

447

19 Mar 2025

Quantum EigenGame for excited state calculation

David Quiroga

Jason Han

Anastasios Kyrillidis

283

17 Mar 2025

Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation

271

13 Mar 2025

Robustness Tokens: Towards Adversarial Robustness of Transformers

European Conference on Computer Vision (ECCV), 2025

243

13 Mar 2025

Freeze and Cluster: A Simple Baseline for Rehearsal-Free Continual Category Discovery

428

12 Mar 2025

Multi-Modal Foundation Models for Computational Pathology: A Survey

452

12 Mar 2025

Task-Agnostic Attacks Against Vision Foundation Models

236

05 Mar 2025

Solving Instance Detection from an Open-World Perspective

Computer Vision and Pattern Recognition (CVPR), 2025

417

01 Mar 2025

MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention

IEEE Transactions on Medical Imaging (IEEE TMI), 2025

596

01 Mar 2025

Projection Head is Secretly an Information Bottleneck

International Conference on Learning Representations (ICLR), 2025

347

01 Mar 2025

MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations

Benedikt Alkin

Lukas Miklautz

Sepp Hochreiter

Johannes Brandstetter

VLM

517

24 Feb 2025

Simplifying DINO via Coding Rate Regularization

1.4K

17 Feb 2025

Masked Latent Prediction and Classification for Self-Supervised Audio Representation Learning

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

372

17 Feb 2025

From Pixels to Components: Eigenvector Masking for Visual Representation Learning

712

10 Feb 2025

Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment

Konstantinos G. Derpanis

349

06 Feb 2025

A generalizable 3D framework and model for self-supervised learning in medical imaging

343

20 Jan 2025

How Well Do Supervised 3D Models Transfer to Medical Imaging Tasks?

321

20 Jan 2025

Keypoint Aware Masked Image Modelling

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

Madhava Krishna

Convin.AI

456

03 Jan 2025

The Dynamic Duo of Collaborative Masking and Target for Advanced Masked Autoencoder Learning

AAAI Conference on Artificial Intelligence (AAAI), 2024

Shentong Mo

246

23 Dec 2024

Equivariant Representation Learning for Augmentation-based Self-Supervised Learning via Image Reconstruction

209

04 Dec 2024

Gen-SIS: Generative Self-augmentation Improves Self-supervised Learning

309

02 Dec 2024

Probing the Mid-level Vision Capabilities of Self-Supervised Learning

Computer Vision and Pattern Recognition (CVPR), 2024

Xuweiyi Chen

Markus Marks

Zezhou Cheng

484

25 Nov 2024

Multi-Token Enhancing for Vision Representation Learning

460

24 Nov 2024

PR-MIM: Delving Deeper into Partial Reconstruction in Masked Image Modeling

355

24 Nov 2024

A Survey of Recent Advances and Challenges in Deep Audio-Visual Correlation Learning

ACM Computing Surveys (ACM CSUR), 2024

Luis Vilaca

Yi Yu

Paula Vinan

476

24 Nov 2024

Relational Contrastive Learning and Masked Image Modeling for Scene Text Recognition

360

18 Nov 2024

Free Lunch in Pathology Foundation Model: Task-specific Model Adaptation with Concept-Guided Feature Enhancement

Neural Information Processing Systems (NeurIPS), 2024

276

15 Nov 2024

CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation

689

15 Nov 2024

Understanding the Role of Equivariance in Self-supervised Learning

Neural Information Processing Systems (NeurIPS), 2024

319

10 Nov 2024

Pattern Integration and Enhancement Vision Transformer for Self-Supervised Learning in Remote Sensing

IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024

233

09 Nov 2024

Classification Done Right for Vision-Language Pre-Training

Neural Information Processing Systems (NeurIPS), 2024

421

05 Nov 2024

Masked Autoencoders are Parameter-Efficient Federated Continual Learners

BigData Congress [Services Society] (BSS), 2024

Yuchen He

Xiangfeng Wang

CLL FedML

285

04 Nov 2024

Sparsh: Self-supervised touch representations for vision-based tactile sensing

Conference on Robot Learning (CoRL), 2024

Carolina Higuera

Akash Sharma

Chaithanya Krishna Bodduluri

...

275

31 Oct 2024

A Fresh Look at Generalized Category Discovery through Non-negative Matrix Factorization

Jingren Liu

339

29 Oct 2024

Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models

Neural Information Processing Systems (NeurIPS), 2024

294

25 Oct 2024

Connecting Joint-Embedding Predictive Architecture with Contrastive Self-supervised Learning

Neural Information Processing Systems (NeurIPS), 2024

Shentong Mo

Shengbang Tong

321

25 Oct 2024