arXiv:2304.07193
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing "DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,189 papers shown
MTReD: 3D Reconstruction Dataset for Fly-over Videos of Maritime Domain
Rui Yi Yong
Samuel Picosson
Arnold Wiliem
32
0
0
02 Mar 2025
Foundation Models Secretly Understand Neural Network Weights: Enhancing Hypernetwork Architectures with Foundation Models
Jeffrey Gu
Serena Yeung-Levy
AI4CE
29
0
0
02 Mar 2025
MIRROR: Multi-Modal Pathological Self-Supervised Representation Learning via Modality Alignment and Retention
Tianyi Wang
Jianan Fan
Dingxin Zhang
Dongnan Liu
Yong-quan Xia
Heng Huang
Weidong Cai
34
0
0
01 Mar 2025
A Guide to Failure in Machine Learning: Reliability and Robustness from Foundations to Practice
Eric Heim
Oren Wright
David Shriver
OOD
FaML
63
0
0
01 Mar 2025
Learning to Animate Images from A Few Videos to Portray Delicate Human Actions
Haoxin Li
Yingchen Yu
Qilong Wu
Hanwang Zhang
Boyang Li
Song Bai
3DH
VGen
120
0
0
01 Mar 2025
Solving Instance Detection from an Open-World Perspective
Qianqian Shen
Yunhan Zhao
Nahyun Kwon
Jeeeun Kim
Yanan Li
Shu Kong
32
0
0
01 Mar 2025
Bring Your Own Grasp Generator: Leveraging Robot Grasp Generation for Prosthetic Grasping
Giuseppe Stracquadanio
Federico Vasile
Elisa Maiettini
Nicoló Boccardo
Lorenzo Natale
27
0
0
01 Mar 2025
CNSv2: Probabilistic Correspondence Encoded Neural Image Servo
Anzhe Chen
Hongxiang Yu
Shuxin Li
Yuxi Chen
Zhongxiang Zhou
Wentao Sun
R. Xiong
Y. Wang
34
0
0
28 Feb 2025
SciceVPR: Stable Cross-Image Correlation Enhanced Model for Visual Place Recognition
Shanshan Wan
Yingmei Wei
Lai Kang
Tianrui Shen
Haixuan Wang
Yee-Hong Yang
41
0
0
28 Feb 2025
Multimodal Dreaming: A Global Workspace Approach to World Model-Based Reinforcement Learning
Léopold Maytié
Roland Bertin Johannet
Rufin VanRullen
OffRL
32
0
0
28 Feb 2025
Ext2Gen: Alignment through Unified Extraction and Generation for Robust Retrieval-Augmented Generation
Hwanjun Song
J. Choi
Minseok Kim
RALM
3DV
66
0
0
28 Feb 2025
UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler
Luigi Piccinelli
Christos Sakaridis
Y. Yang
Mattia Segu
Siyuan Li
Wim Abbeloos
Luc Van Gool
MDE
41
6
0
27 Feb 2025
Enhanced Contrastive Learning with Multi-view Longitudinal Data for Chest X-ray Report Generation
Kang Liu
Zhuoqi Ma
Xiaolu Kang
Yunan Li
Kun Xie
Zhicheng Jiao
Qiguang Miao
36
3
0
27 Feb 2025
ATLAS Navigator: Active Task-driven LAnguage-embedded Gaussian Splatting
Dexter Ong
Yuezhan Tao
Varun Murali
Igor Spasojevic
Vijay R. Kumar
Pratik Chaudhari
3DGS
59
0
0
27 Feb 2025
Multi-Keypoint Affordance Representation for Functional Dexterous Grasping
Fan Yang
DongSheng Luo
Wenrui Chen
Jiacheng Lin
Junjie Cai
Kailun Yang
Z. Li
Yaonan Wang
44
0
0
27 Feb 2025
SegLocNet: Multimodal Localization Network for Autonomous Driving via Bird's-Eye-View Segmentation
Zijie Zhou
Zhangshuo Qi
Luqi Cheng
Guangming Xiong
58
1
0
27 Feb 2025
Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation
Sucheng Ren
Qihang Yu
Ju He
Xiaohui Shen
Alan Yuille
Liang-Chieh Chen
VGen
81
6
0
27 Feb 2025
Vector-Quantized Vision Foundation Models for Object-Centric Learning
Rongzhen Zhao
V. Wang
Juho Kannala
J. Pajarinen
OCL
VLM
169
0
0
27 Feb 2025
New Dataset and Methods for Fine-Grained Compositional Referring Expression Comprehension via Specialist-MLLM Collaboration
X. J. Yang
J. Liu
Peng Wang
Guoqing Wang
Y. Yang
H. Shen
ObjD
79
0
0
27 Feb 2025
MITracker: Multi-View Integration for Visual Object Tracking
Mengjie Xu
Yitao Zhu
Haotian Jiang
Jiaming Li
Zhenrong Shen
...
Haolin Huang
Xinyu Wang
Qing Yang
H. Zhang
Qian Wang
41
0
0
27 Feb 2025
Tell me why: Visual foundation models as self-explainable classifiers
Hugues Turbé
Mina Bjelogrlic
G. Mengaldo
Christian Lovis
61
0
0
26 Feb 2025
GONet: A Generalizable Deep Learning Model for Glaucoma Detection
Or Abramovich
Hadas Pizem
Jonathan Fhima
Eran Berkowitz
Ben Gofrit
...
Meital Baskin
Jan Van Eijgen
Ingeborg Stalmans
E. Blumenthal
Joachim A. Behar
59
1
0
26 Feb 2025
From underwater to aerial: a novel multi-scale knowledge distillation approach for coral reef monitoring
Matteo Contini
Victor Illien
Julien Barde
Sylvain Poulain
Serge Bernard
Alexis Joly
Sylvain Bonhommeau
70
0
0
25 Feb 2025
Escaping The Big Data Paradigm in Self-Supervised Representation Learning
Carlos Vélez García
Miguel Cazorla
Jorge Pomares
49
0
0
25 Feb 2025
What are Foundation Models Cooking in the Post-Soviet World?
Anton Lavrouk
Tarek Naous
Alan Ritter
Wei-ping Xu
63
0
0
25 Feb 2025
Enhancing Reusability of Learned Skills for Robot Manipulation via Gaze and Bottleneck
Ryo Takizawa
Izumi Karino
Koki Nakagawa
Y. Ohmura
Y. Kuniyoshi
77
1
0
25 Feb 2025
PromptMID: Modal Invariant Descriptors Based on Diffusion and Vision Foundation Models for Optical-SAR Image Matching
Han Nie
B. Luo
Jun Liu
Z. Fu
Huan Zhou
Shuo Zhang
Weixing Liu
DiffM
VLM
79
0
0
25 Feb 2025
LAM: Large Avatar Model for One-shot Animatable Gaussian Head
Yisheng He
Xiaodong Gu
Xiaodan Ye
Chao Xu
Zhengyi Zhao
Yuan Dong
Weihao Yuan
Zilong Dong
Liefeng Bo
3DGS
74
0
0
25 Feb 2025
Few-shot Species Range Estimation
Christian Lange
Max Hamilton
Elijah Cole
Alexander Shepard
Samuel Heinrich
Angela Zhu
Subhransu Maji
Grant Van Horn
Oisin Mac Aodha
76
0
0
24 Feb 2025
Vision-LSTM: xLSTM as Generic Vision Backbone
Benedikt Alkin
M. Beck
Korbinian Poppel
Sepp Hochreiter
Johannes Brandstetter
VLM
56
42
0
24 Feb 2025
Continuous Wrist Control on the Hannes Prosthesis: a Vision-based Shared Autonomy Framework
Federico Vasile
Elisa Maiettini
Giulia Pasquale
Nicoló Boccardo
Lorenzo Natale
36
0
0
24 Feb 2025
Disentangling Visual Transformers: Patch-level Interpretability for Image Classification
Guillaume Jeanneret
Loïc Simon
F. Jurie
ViT
44
0
0
24 Feb 2025
Introducing Visual Perception Token into Multimodal Large Language Model
Runpeng Yu
Xinyin Ma
Xinchao Wang
MLLM
LRM
73
0
0
24 Feb 2025
A Pragmatic Note on Evaluating Generative Models with Fréchet Inception Distance for Retinal Image Synthesis
Yuli Wu
Fucheng Liu
Rüveyda Yilmaz
Henning Konermann
Peter Walter
Johannes Stegmaier
EGVM
MedIm
48
1
0
24 Feb 2025
SwimVG: Step-wise Multimodal Fusion and Adaption for Visual Grounding
Liangtao Shi
Ting Liu
Xiantao Hu
Yue Hu
Quanjun Yin
Richang Hong
ObjD
46
0
0
24 Feb 2025
VaViM and VaVAM: Autonomous Driving through Video Generative Modeling
Florent Bartoccioni
Elias Ramzi
Victor Besnier
Shashanka Venkataramanan
Tuan-Hung Vu
...
Mickael Chen
Éloi Zablocki
Andrei Bursuc
Eduardo Valle
Matthieu Cord
VGen
78
1
0
24 Feb 2025
DemoGen: Synthetic Demonstration Generation for Data-Efficient Visuomotor Policy Learning
Zhengrong Xue
Shuying Deng
Zhenyang Chen
Yixuan Wang
Zhecheng Yuan
Huazhe Xu
41
5
0
24 Feb 2025
Enhancing Image Matting in Real-World Scenes with Mask-Guided Iterative Refinement
Rui Liu
39
0
0
24 Feb 2025
Fair Foundation Models for Medical Image Analysis: Challenges and Perspectives
Dilermando Queiroz
Anderson Carlos
André Anjos
Lilian Berton
43
0
0
24 Feb 2025
Unveiling Institution-Specific Bias in Pathology Foundation Models: Detriments, Causes, and Potential Solutions
Weiping Lin
Shen Liu
Runchen Zhu
Liansheng Wang
41
1
0
24 Feb 2025
MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations
Benedikt Alkin
Lukas Miklautz
Sepp Hochreiter
Johannes Brandstetter
VLM
63
8
0
24 Feb 2025
FUNCTO: Function-Centric One-Shot Imitation Learning for Tool Manipulation
Chao Tang
Anxing Xiao
Yuhong Deng
Tianrun Hu
Wenlong Dong
Hanbo Zhang
David Hsu
Hong Zhang
71
2
0
24 Feb 2025
Dragen3D: Multiview Geometry Consistent 3D Gaussian Generation with Drag-Based Control
Jinbo Yan
Alan Zhao
Yixin Hu
3DGS
127
0
0
23 Feb 2025
SelaVPR++: Towards Seamless Adaptation of Foundation Models for Efficient Place Recognition
Feng Lu
Tong Jin
X. Lan
Lijun Zhang
Yunpeng Liu
Yaowei Wang
Chun Yuan
34
0
0
23 Feb 2025
Human2Robot: Learning Robot Actions from Paired Human-Robot Videos
Sicheng Xie
Haidong Cao
Zejia Weng
Zhen Xing
Shiwei Shen
Jiaqi Leng
Xipeng Qiu
Yanwei Fu
Zuxuan Wu
Yu Jiang
49
0
0
23 Feb 2025
Understanding the Emergence of Multimodal Representation Alignment
Megan Tjandrasuwita
Chanakya Ekbote
Liu Ziyin
Paul Pu Liang
47
1
0
22 Feb 2025
DynamicGSG: Dynamic 3D Gaussian Scene Graphs for Environment Adaptation
Luzhou Ge
Xiangyu Zhu
Zhuo Yang
Xuesong Li
3DGS
70
0
0
21 Feb 2025
Structurally Disentangled Feature Fields Distillation for 3D Understanding and Editing
Yoel Levy
David Shavin
Itai Lang
Sagie Benaim
83
0
0
21 Feb 2025
Data Attribution for Text-to-Image Models by Unlearning Synthesized Images
Sheng-Yu Wang
Aaron Hertzmann
Alexei A. Efros
Jun-Yan Zhu
Richard Zhang
TDI
126
2
0
21 Feb 2025
Textured 3D Regenerative Morphing with 3D Diffusion Prior
Songlin Yang
Yushi Lan
Honghua Chen
Xingang Pan
DiffM
61
0
0
21 Feb 2025