2304.07193
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,169 papers shown
Title
Quantum Speedups for Markov Chain Monte Carlo Methods with Application to Optimization
Guneykan Ozgul
Xiantao Li
Mehrdad Mahdavi
Chunhao Wang
34
1
0
04 Apr 2025
Dynamic Objective MPC for Motion Planning of Seamless Docking Maneuvers
Oliver Schumann
Michael Buchholz
Klaus C. J. Dietmayer
38
0
0
04 Apr 2025
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
Mateusz Pach
Shyamgopal Karthik
Quentin Bouniot
Serge Belongie
Zeynep Akata
VLM
62
0
0
03 Apr 2025
BOP Challenge 2024 on Model-Based and Model-Free 6D Object Pose Estimation
Van Nguyen Nguyen
Stephen Tyree
Andrew Guo
Mederic Fourmy
Anas Gouda
...
Stan Birchfield
Jiri Matas
Yann Labbé
M. Sundermeyer
Tomás Hodan
3DPC
48
1
0
03 Apr 2025
Towards Generalizing Temporal Action Segmentation to Unseen Views
Emad Bahrami
Olga Zatsarynna
Gianpiero Francesca
Juergen Gall
EgoV
38
0
0
03 Apr 2025
Agglomerating Large Vision Encoders via Distillation for VFSS Segmentation
Chengxi Zeng
Yuxuan Jiang
Fan Zhang
A. Gambaruto
T. Burghardt
MedIm
40
0
0
03 Apr 2025
PicoPose: Progressive Pixel-to-Pixel Correspondence Learning for Novel Object Pose Estimation
Lihua Liu
Jiehong Lin
Zhenxin Liu
Kui Jia
33
0
0
03 Apr 2025
Multimodal Reference Visual Grounding
Yangxiao Lu
Ruosen Li
Liqiang Jing
Jikai Wang
Xinya Du
Yunhui Guo
Nicholas Ruozzi
Yu Xiang
ObjD
76
0
0
02 Apr 2025
Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis
Zixuan Wang
Duo Peng
Feng Chen
Y. Yang
Yinjie Lei
DiffM
74
0
0
02 Apr 2025
Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval
Yuji Nozawa
Yu Lin
Kazumoto Nakamura
Youyang Ng
38
0
0
02 Apr 2025
ProtoGCD: Unified and Unbiased Prototype Learning for Generalized Category Discovery
Shijie Ma
Fei Zhu
Xu-Yao Zhang
Cheng-Lin Liu
27
1
0
02 Apr 2025
UniViTAR: Unified Vision Transformer with Native Resolution
Limeng Qiao
Yiyang Gan
Bairui Wang
Jie Qin
Shuang Xu
Siqi Yang
Lin Ma
50
0
0
02 Apr 2025
All Patches Matter, More Patches Better: Enhance AI-Generated Image Detection via Panoptic Patch Learning
Zheng Yang
Ruoxin Chen
Zhiyuan Yan
Ke-Yue Zhang
Xinghe Fu
...
Xiujun Shu
Taiping Yao
Junchi Yan
Shouhong Ding
Xi Li
29
0
0
02 Apr 2025
Anomaly Detection for Hybrid Butterfly Subspecies via Probability Filtering
Bo-Kai Ruan
Yi-Zeng Fang
Hong-Han Shuai
Juinn-Dar Huang
40
0
0
02 Apr 2025
Scene-Centric Unsupervised Panoptic Segmentation
Oliver Hahn
Christoph Reich
Nikita Araslanov
Daniel Cremers
Christian Rupprecht
Stefan Roth
OCL
57
0
0
02 Apr 2025
Slot-Level Robotic Placement via Visual Imitation from Single Human Video
Dandan Shan
Kaichun Mo
Wei Yang
Yu-Wei Chao
David Fouhey
Dieter Fox
Arsalan Mousavian
36
0
0
02 Apr 2025
v-CLR: View-Consistent Learning for Open-World Instance Segmentation
Chang-Bin Zhang
Jinhong Ni
Yujie Zhong
Kai Han
3DV
VLM
57
0
0
02 Apr 2025
A Diffusion-Based Framework for Occluded Object Movement
Zheng-Peng Duan
Jiawei Zhang
Siyu Liu
Zheng Lin
Chun-Le Guo
Dongqing Zou
Jimmy S. Ren
Chongyi Li
36
0
0
02 Apr 2025
Scaling Language-Free Visual Representation Learning
David Fan
Shengbang Tong
Jiachen Zhu
Koustuv Sinha
Zhuang Liu
...
Michael G. Rabbat
Nicolas Ballas
Yann LeCun
Amir Bar
Saining Xie
CLIP
VLM
56
2
0
01 Apr 2025
Shot-by-Shot: Film-Grammar-Aware Training-Free Audio Description Generation
Junyu Xie
Tengda Han
Max Bain
Arsha Nagrani
Eshika Khandelwal
Gül Varol
Weidi Xie
Andrew Zisserman
DiffM
VGen
55
0
0
01 Apr 2025
Distilling Multi-view Diffusion Models into 3D Generators
Hao Qin
Luyuan Chen
Ming Kong
Mengxu Lu
Qiang Zhu
3DGS
64
0
0
01 Apr 2025
SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning
Fida Mohammad Thoker
Letian Jiang
Chen Zhao
Bernard Ghanem
53
0
0
01 Apr 2025
Coca-Splat: Collaborative Optimization for Camera Parameters and 3D Gaussians
Jiamin Wu
Hongyang Li
Xiaoke Jiang
Yuan Yao
Lei Zhang
3DGS
49
0
0
01 Apr 2025
MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
Siyuan Li
L. Zhang
Zedong Wang
Juanxi Tian
Cheng Tan
...
Chang Yu
Qingsong Xie
Haonan Lu
Haoqian Wang
Zhen Lei
46
0
0
01 Apr 2025
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors
Tian-Xing Xu
Xiangjun Gao
Wenbo Hu
Xiaoyu Li
Song-Hai Zhang
Ying Shan
VGen
MDE
56
1
0
01 Apr 2025
Spingarn's Method and Progressive Decoupling Beyond Elicitable Monotonicity
B. Evens
P. Latafat
Panagiotis Patrinos
46
0
0
01 Apr 2025
DecoFuse: Decomposing and Fusing the "What", "Where", and "How" for Brain-Inspired fMRI-to-Video Decoding
Chong Li
Jingyang Huo
Weikang Gong
Yanwei Fu
Xiangyang Xue
Jianfeng Feng
38
0
0
01 Apr 2025
GECKO: Gigapixel Vision-Concept Contrastive Pretraining in Histopathology
S. Kapse
Pushpak Pati
Srikar Yellapragada
Srijan Das
Rajarsi R. Gupta
Joel H. Saltz
Dimitris Samaras
Prateek Prasanna
VLM
41
0
0
01 Apr 2025
CBIL: Collective Behavior Imitation Learning for Fish from Real Videos
Yifan Wu
Zhiyang Dou
Yuko Ishiwaka
Shun Ogawa
Yuke Lou
Wenping Wang
Lingjie Liu
Taku Komura
40
3
0
31 Mar 2025
Detecting Glioma, Meningioma, and Pituitary Tumors, and Normal Brain Tissues based on Yolov11 and Yolov8 Deep Learning Models
Ahmed M. Taha
Salah A. Aly
Mohamed F. Darwish
31
0
0
31 Mar 2025
From Colors to Classes: Emergence of Concepts in Vision Transformers
Teresa Dorszewski
Lenka Tětková
Robert Jenssen
Lars Kai Hansen
Kristoffer Wickstrøm
37
0
0
31 Mar 2025
Leveraging Diffusion Model and Image Foundation Model for Improved Correspondence Matching in Coronary Angiography
Lin Zhao
Xin Yu
Yikang Liu
Xiao Chen
Eric Z. Chen
Terrence Chen
Shanhui Sun
DiffM
MedIm
40
0
0
31 Mar 2025
Multi-Task Learning for Extracting Menstrual Characteristics from Clinical Notes
Anna Shopova
Cristoph Lippert
Leslee J. Shaw
Eugenia Alleva
42
0
0
31 Mar 2025
AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos
Felix Wimbauer
Weirong Chen
Dominik Muhle
Christian Rupprecht
Daniel Cremers
VGen
65
0
0
30 Mar 2025
Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model
Jannik Endres
Oliver Hahn
Charles Corbière
Simone Schaub-Meyer
Stefan Roth
Alexandre Alahi
MDE
37
0
0
30 Mar 2025
COSMIC: Clique-Oriented Semantic Multi-space Integration for Robust CLIP Test-Time Adaptation
Fanding Huang
Jingyan Jiang
Qinting Jiang
Hebei Li
Faisal Nadeem Khan
Zhi Wang
VLM
50
0
0
30 Mar 2025
Improved Ear Verification with Vision Transformers and Overlapping Patches
Deeksha Arun
Kagan Öztürk
Kevin W. Bowyer
Patrick Flynn
30
0
0
30 Mar 2025
VideoGen-Eval: Agent-based System for Video Generation Evaluation
Yuhang Yang
Ke Fan
S.
Hongxiang Li
Ailing Zeng
FeiLin Han
Wei-dong Zhai
W. Liu
Yang Cao
Zheng-jun Zha
EGVM
VGen
73
0
0
30 Mar 2025
Diffusion Meets Few-shot Class Incremental Learning
Junsu Kim
Yunhoe Ku
Dongyoon Han
Seungryul Baek
DiffM
CLL
42
0
0
30 Mar 2025
DASH: Detection and Assessment of Systematic Hallucinations of VLMs
Maximilian Augustin
Yannic Neuhaus
Matthias Hein
VLM
47
1
0
30 Mar 2025
Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection
Marc-Antoine Lavoie
Anas Mahmoud
Steven Waslander
37
0
0
29 Mar 2025
MeshCraft: Exploring Efficient and Controllable Mesh Generation with Flow-based DiTs
Xianglong He
Junyi Chen
Di Huang
Zexiang Liu
Xiaoshui Huang
Wanli Ouyang
C. Yuan
Yangguang Li
DiffM
52
0
0
29 Mar 2025
OncoReg: Medical Image Registration for Oncological Challenges
Wiebke Heyer
Yannic Elser
Lennart Berkel
Xinrui Song
Xuanang Xu
...
Christoph Großbröhmer
Lasse Hansen
Alessa Hering
Malte M. Sieren
Mattias P. Heinrich
36
0
0
29 Mar 2025
Multi-label classification for multi-temporal, multi-spatial coral reef condition monitoring using vision foundation model with adapter learning
Xinlei Shao
Hongruixuan Chen
Fan Zhao
Kirsty Magson
Jundong Chen
Peiran Li
J. Wang
Jun Sasaki
44
0
0
29 Mar 2025
A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
Ziyue Huang
Hongxi Yan
Qiqi Zhan
Shuai Yang
Mingming Zhang
Chenkai Zhang
Yiming Lei
Zeming Liu
Qingjie Liu
Y. Wang
42
0
0
28 Mar 2025
Understanding Co-speech Gestures in-the-wild
Sindhu B. Hegde
KR Prajwal
Taein Kwon
Andrew Zisserman
SLR
52
0
0
28 Mar 2025
A Proposal for Networks Capable of Continual Learning
Zeki Doruk Erden
Boi Faltings
CLL
42
0
0
28 Mar 2025
MVSAnywhere: Zero-Shot Multi-View Stereo
Sergio Izquierdo
Mohamed Sayed
Michael Firman
Guillermo Garcia-Hernando
Daniyar Turmukhambetov
Javier Civera
Oisin Mac Aodha
Gabriel J. Brostow
Jamie Watson
3DV
39
3
0
28 Mar 2025
AnnoPage Dataset: Dataset of Non-Textual Elements in Documents with Fine-Grained Categorization
Martin Kiss
Michal Hradiš
Martina Dvořáková
Václav Jiroušek
Filip Kersch
43
1
0
28 Mar 2025
High-Fidelity Diffusion Face Swapping with ID-Constrained Facial Conditioning
Dailan He
X. Wang
Shulun Wang
Guanglu Song
Bingqi Ma
Hao Shao
Y. Liu
Hongsheng Li
DiffM
60
0
0
28 Mar 2025