2111.07832
Cited By
v3 (latest)
iBOT: Image BERT Pre-Training with Online Tokenizer
15 November 2021
Jinghao Zhou
Chen Wei
Huiyu Wang
Wei Shen
Cihang Xie
Alan Yuille
Tao Kong
Papers citing
"iBOT: Image BERT Pre-Training with Online Tokenizer"
50 / 602 papers shown
A Stitch in Time: Learning Procedural Workflow via Self-Supervised Plackett-Luce Ranking
Chengan Che
Chao Wang
Xinyue Chen
Sophia Tsoka
Luis C. Garcia-Peraza-Herrera
AI4TS
146
0
0
21 Nov 2025
MuM: Multi-View Masked Image Modeling for 3D Vision
David Nordström
Johan Edstedt
Fredrik Kahl
Georg Bökman
116
0
0
21 Nov 2025
Unsupervised Image Classification with Adaptive Nearest Neighbor Selection and Cluster Ensembles
Melih Baydar
Emre Akbas
116
0
0
20 Nov 2025
Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling
Minseok Seo
Mark Hamilton
Changick Kim
112
0
0
20 Nov 2025
MergeSlide: Continual Model Merging and Task-to-Class Prompt-Aligned Inference for Lifelong Learning on Whole Slide Images
Doanh C. Bui
Ba-Hung Ngo
H. Pham
Khang Phuoc-Quy Nguyen
Maï K. Nguyen
Y. Nakashima
CLL
MoMe
VLM
234
0
0
17 Nov 2025
SOMA: Feature Gradient Enhanced Affine-Flow Matching for SAR-Optical Registration
Haodong Wang
Tao Zhuo
Xiuwei Zhang
Hanlin Yin
Wencong Wu
Yanning Zhang
36
0
0
17 Nov 2025
Rank-Aware Agglomeration of Foundation Models for Immunohistochemistry Image Cell Counting
Z. Huang
Mengxin Tian
Huan Liu
W. Li
Baobao Liang
J. Wu
Fang Yan
Zhaoqing Tang
Z. Li
68
0
0
16 Nov 2025
CoMA: Complementary Masking and Hierarchical Dynamic Multi-Window Self-Attention in a Unified Pre-training Framework
Jiaxuan Li
Qing Xu
Xiangjian He
Ziyu Liu
Chang Xing
Zhen Chen
Daokun Zhang
Rong Qu
Chang Wen Chen
60
0
0
08 Nov 2025
Another BRIXEL in the Wall: Towards Cheaper Dense Features
Alexander Lappe
Martin A. Giese
108
0
0
07 Nov 2025
MedDChest: A Content-Aware Multimodal Foundational Vision Model for Thoracic Imaging
Mahmoud Soliman
Islam I. Osman
Mohamed S. Shehata
Rasika Rajapakshe
MedIm
142
0
0
06 Nov 2025
Self-supervised Synthetic Pretraining for Inference of Stellar Mass Embedded in Dense Gas
Keiya Hirashima
Shingo Nozaki
Naoto Harada
49
0
0
28 Oct 2025
DecoDINO: 3D Human-Scene Contact Prediction with Semantic Classification
Lukas Bierling
Davide Pasero
Fleur Dolmans
Helia Ghasemi
Angelo Broere
58
0
0
27 Oct 2025
Randomized-MLP Regularization Improves Domain Adaptation and Interpretability in DINOv2
Joel Valdivia Ortega
Lorenz Lamm
Franziska Eckardt
Benedikt Schworm
Marion Jasnin
Tingying Peng
MedIm
72
0
0
24 Oct 2025
VESSA: Video-based objEct-centric Self-Supervised Adaptation for Visual Foundation Models
Jesimon Barreto
C. Caetano
A. Araújo
William Robson Schwartz
VLM
96
0
0
23 Oct 2025
From Masks to Worlds: A Hitchhiker's Guide to World Models
Jinbin Bai
Yu Lei
H. Wu
Yuchen Zhu
Shufan Li
Yi Xin
Xiangtai Li
Molei Tao
Aditya Grover
Ming-Hsuan Yang
VGen
SyDa
140
2
0
23 Oct 2025
Exploring Structural Degradation in Dense Representations for Self-supervised Learning
Siran Dai
Qianqian Xu
Peisong Wen
Yang Liu
Qingming Huang
92
1
0
20 Oct 2025
Comprehensive language-image pre-training for 3D medical image understanding
Tassilo Wald
Ibrahim Ethem Hamamci
Yuan Gao
Sam Bond-Taylor
H. Sharma
...
Klaus H. Maier-Hein
Panagiotis Korfiatis
Valentina Salvatelli
Javier Alvarez-Valle
Fernando Pérez-García
MedIm
VLM
100
0
0
16 Oct 2025
Towards Generalist Intelligence in Dentistry: Vision Foundation Models for Oral and Maxillofacial Radiology
Xinrui Huang
Fan Xiao
Dongming He
Anqi Gao
Dandan Li
Xiaofan Zhang
Shaoting Zhang
Xudong Wang
MedIm
LM&MA
169
0
0
16 Oct 2025
Semantic representations emerge in biologically inspired ensembles of cross-supervising neural networks
Roy Urbach
Elad Schneidman
SSL
116
0
0
16 Oct 2025
G2L: From Giga-Scale to Cancer-Specific Large-Scale Pathology Foundation Models via Knowledge Distillation
Yesung Cho
Sungmin Lee
Geongyu Lee
Minkyung Lee
JongBae Park
DongMyung Shin
44
0
0
13 Oct 2025
Diffusion Transformers with Representation Autoencoders
Boyang Zheng
Nanye Ma
Shengbang Tong
Saining Xie
DiffM
130
26
0
13 Oct 2025
Equipping Vision Foundation Model with Mixture of Experts for Out-of-Distribution Detection
Shizhen Zhao
Jiahui Liu
Xin Wen
Haoru Tan
Xiaojuan Qi
OODD
VLM
190
0
0
12 Oct 2025
GAS-MIL: Group-Aggregative Selection Multi-Instance Learning for Ensemble of Foundation Models in Digital Pathology Image Analysis
Peiran Quan
Zifan Gu
Zhuo Zhao
Qin Zhou
Peifeng Ruan
Ruichen Rong
Yang Xie
Tao Wang
AI4CE
72
0
0
03 Oct 2025
CLASP: Adaptive Spectral Clustering for Unsupervised Per-Image Segmentation
Max Curie
Paulo da Costa
VLM
70
0
0
29 Sep 2025
One-Prompt Strikes Back: Sparse Mixture of Experts for Prompt-based Continual Learning
Minh Le
Bao-Ngoc Dao
Huy Le Nguyen
Quyen Tran
Anh-Viêt Nguyên
Nhat Ho
CLL
MoE
92
0
0
29 Sep 2025
Benchmarking DINOv3 for Multi-Task Stroke Analysis on Non-Contrast CT
Donghao Zhang
Yimin Chen
Kauê TN Duarte
Taha Aslan
Mohamed AlShamrani
...
Yan Wan
Shengcai Chen
Bo Hu
Bijoy K Menon
Wu Qiu
52
0
0
27 Sep 2025
UNIV: Unified Foundation Model for Infrared and Visible Modalities
Fangyuan Mao
Shuo Wang
Jilin Mei
Chen Min
Shun Lu
Fuyang Liu
Xiaokun Feng
Meiqi Wu
Yu Hu
52
0
0
19 Sep 2025
Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation
Xiaoyu Yue
Zidong Wang
Yuqing Wang
Wenlong Zhang
Xihui Liu
Wanli Ouyang
Wenlong Zhang
Luping Zhou
GAN
193
2
0
18 Sep 2025
SAMIR, an efficient registration framework via robust feature learning from SAM
Yue He
Min Liu
Qinghao Liu
Jiazheng Wang
Yaonan Wang
Hang Zhang
Xiang Chen
MedIm
52
0
0
17 Sep 2025
Data Scaling Laws for Radiology Foundation Models
Maximilian Ilse
Harshita Sharma
Anton Schwaighofer
Sam Bond-Taylor
Fernando Pérez-García
...
Maria T. A. Wetscherek
Noel C. F. Codella
Javier Alvarez-Valle
Panagiotis Korfiatis
Valentina Salvatelli
MedIm
135
0
0
16 Sep 2025
LadderSym: A Multimodal Interleaved Transformer for Music Practice Error Detection
Benjamin Shiue-Hal Chou
Purvish Jajal
Nick Eliopoulos
James C. Davis
George K. Thiruvathukal
Kristen Yeon-Ji Yun
Yung-Hsiang Lu
104
0
0
16 Sep 2025
BATR-FST: Bi-Level Adaptive Token Refinement for Few-Shot Transformers
Mohammed Al-Habib
Zuping Zhang
Abdulrahman Noman
56
0
0
16 Sep 2025
Disentangling Content from Style to Overcome Shortcut Learning: A Hybrid Generative-Discriminative Learning Framework
Siming Fu
Sijun Dong
Xiaoliang Meng
241
0
0
15 Sep 2025
Domain-Adaptive Pretraining Improves Primate Behavior Recognition
Felix B. Mueller
Timo Lueddecke
Richard Vogg
Alexander S. Ecker
81
1
0
15 Sep 2025
LayerLock: Non-collapsing Representation Learning with Progressive Freezing
Goker Erdogan
Nikhil Parthasarathy
Catalin Ionescu
Drew A. Hudson
Alexander Lerchner
Andrew Zisserman
Mehdi S. M. Sajjadi
João Carreira
104
0
0
12 Sep 2025
DualTrack: Sensorless 3D Ultrasound needs Local and Global Context
P. Wilson
Matteo Ronchetti
Rüdiger Göbl
Viktoria Markova
Sebastian Rosenzweig
R. Prevost
P. Mousavi
O. Zettinig
52
0
0
11 Sep 2025
Semantic Concentration for Self-Supervised Dense Representations Learning
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2025
Peisong Wen
Qianqian Xu
Siran Dai
Runmin Cong
Qingming Huang
108
2
0
11 Sep 2025
PeftCD: Leveraging Vision Foundation Models with Parameter-Efficient Fine-Tuning for Remote Sensing Change Detection
Sijun Dong
Yuxuan Hu
Libo Wang
Geng Chen
Xiaoliang Meng
92
1
0
11 Sep 2025
Chirality in Action: Time-Aware Video Representation Learning by Latent Straightening
Piyush Bagad
Andrew Zisserman
AI4TS
196
2
0
10 Sep 2025
QualityFM: a Multimodal Physiological Signal Foundation Model with Self-Distillation for Signal Quality Challenges in Critically Ill Patients
Zongheng Guo
Tao Chen
Manuela Ferrario
73
0
0
08 Sep 2025
Patch-Level Kernel Alignment for Dense Self-Supervised Learning
Juan Yeo
Ijun Jang
Taesup Kim
SSL
MDE
211
0
0
06 Sep 2025
Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views
Xiangdong Zhang
Shaofeng Zhang
Junchi Yan
3DPC
145
1
0
01 Sep 2025
OmniMRI: A Unified Vision-Language Foundation Model for Generalist MRI Interpretation
Xingxin He
Aurora Rofena
Ruimin Feng
Haozhe Liao
Zhaoye Zhou
Albert Jang
Fang Liu
MedIm
62
0
0
24 Aug 2025
Boosting Pathology Foundation Models via Few-shot Prompt-tuning for Rare Cancer Subtyping
Dexuan He
Xiao Zhou
Wenbin Guan
Liyuan Zhang
Xiaoman Zhang
...
Xin Sun
Yanfeng Wang
Kun Sun
Ya Zhang
Weidi Xie
VLM
75
0
0
21 Aug 2025
MATPAC++: Enhanced Masked Latent Prediction for Self-Supervised Audio Representation Learning
Aurian Quélennec
Pierre Chouteau
Geoffroy Peeters
S. Essid
116
0
0
18 Aug 2025
RISE: Enhancing VLM Image Annotation with Self-Supervised Reasoning
Suhang Hu
Wei Hu
Yuhang Su
Fan Zhang
ReLM
LRM
VLM
216
0
0
17 Aug 2025
DermINO: Hybrid Pretraining for a Versatile Dermatology Foundation Model
Jingkai Xu
De Cheng
Xiangqian Zhao
Jungang Yang
Zilong Wang
...
Jianming Liang
Lili Qiu
Nannan Wang
Xianbo Zuo
Cui Yong
MedIm
145
0
0
17 Aug 2025
MAESTRO: Masked AutoEncoders for Multimodal, Multitemporal, and Multispectral Earth Observation Data
Antoine Labatie
Michael Vaccaro
Nina Lardiere
A. Garioud
Nicolas Gonthier
168
0
0
14 Aug 2025
Towards Comprehensive Cellular Characterisation of H&E slides
Benjamin Adjadj
Pierre-Antoine Bannier
Guillaume Horent
Sebastien Mandela
Aurore Lyon
...
Reda Belbahri
Benoît Schmauch
Eric Durand
Katharina Von Loga
Lucie Gillet
VLM
96
0
0
13 Aug 2025
CoMAD: A Multiple-Teacher Self-Supervised Distillation Framework
Sriram Mandalika
Lalitha V
MoE
VLM
110
0
0
06 Aug 2025