2304.07193
Cited By
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,189 papers shown
Title
Object-Centric Pretraining via Target Encoder Bootstrapping
Nikola Đukić
Tim Lebailly
Tinne Tuytelaars
OCL
66
0
0
19 Mar 2025
pFedFair: Towards Optimal Group Fairness-Accuracy Trade-off in Heterogeneous Federated Learning
Haoyu Lei
Shizhan Gong
Qi Dou
Farzan Farnia
FedML
59
0
0
19 Mar 2025
TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning in Text-to-Image Models
Teng-Fang Hsiao
Bo-Kai Ruan
Yi-Lun Wu
Tzu-Ling Lin
Hong-Han Shuai
VLM
48
0
0
19 Mar 2025
When Domain Generalization meets Generalized Category Discovery: An Adaptive Task-Arithmetic Driven Approach
Vaibhav Rathore
S. Bagchi
Saikat Dutta
Sarthak Mehrotra
Zsolt Kira
Biplab Banerjee
OOD
74
1
0
19 Mar 2025
Visual Persona: Foundation Model for Full-Body Human Customization
Jisu Nam
Soowon Son
Zhan Xu
Jing Shi
Difan Liu
Feng Liu
Aashish Misraa
Seungryong Kim
Yang Zhou
DiffM
39
0
0
19 Mar 2025
Conjuring Positive Pairs for Efficient Unification of Representation Learning and Image Synthesis
Imanol G. Estepa
Jesús M. Rodríguez-de-Vera
Ignacio Sarasúa
Bhalaji Nagarajan
P. Radeva
52
0
0
19 Mar 2025
Distilling 3D distinctive local descriptors for 6D pose estimation
Amir Hamza
Andrea Caraffa
Davide Boscaini
Fabio Poiesi
44
0
0
19 Mar 2025
TULIP: Towards Unified Language-Image Pretraining
Zineng Tang
Long Lian
Seun Eisape
Xudong Wang
Roei Herzig
Adam Yala
Alane Suhr
Trevor Darrell
David M. Chan
VLM
CLIP
MLLM
95
3
0
19 Mar 2025
SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models
Subhadeep Koley
Tapas Kumar Dutta
Aneeshan Sain
Pinaki Nath Chowdhury
A. Bhunia
Yi-Zhe Song
VLM
66
0
0
18 Mar 2025
RoGSplat: Learning Robust Generalizable Human Gaussian Splatting from Sparse Multi-View Images
Junjin Xiao
Qing Zhang
Yonewei Nie
Lei Zhu
Wei-Shi Zheng
3DGS
79
0
0
18 Mar 2025
RFMI: Estimating Mutual Information on Rectified Flow for Text-to-Image Alignment
Chao Wang
Giulio Franzese
A. Finamore
Pietro Michiardi
64
0
0
18 Mar 2025
Learning Shape-Independent Transformation via Spherical Representations for Category-Level Object Pose Estimation
Huan Ren
Wenfei Yang
Xiang Liu
Shifeng Zhang
Tianzhu Zhang
67
2
0
18 Mar 2025
Where do Large Vision-Language Models Look at when Answering Questions?
X. Xing
Chia-Wen Kuo
Li Fuxin
Yulei Niu
Fan Chen
Ming Li
Ying Wu
Longyin Wen
Sijie Zhu
LRM
58
0
0
18 Mar 2025
RoMedFormer: A Rotary-Embedding Transformer Foundation Model for 3D Genito-Pelvic Structure Segmentation in MRI and CT
Yuheng Li
Mingzhe Hu
Richard L. J. Qiu
Maria Thor
Andre Williams
Deborah Marshall
Xiaofeng Yang
MedIm
62
0
0
18 Mar 2025
Utilization of Neighbor Information for Image Classification with Different Levels of Supervision
Gihan Jayatilaka
Abhinav Shrivastava
M. Gwilliam
59
0
0
18 Mar 2025
Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting
Runsong Zhu
Shi Qiu
Zhengzhe Liu
Ka-Hei Hui
Qianyi Wu
Pheng Ann Heng
Chi-Wing Fu
3DGS
3DV
88
1
0
18 Mar 2025
Text-Guided Image Invariant Feature Learning for Robust Image Watermarking
Muhammad Ahtesham
Xin Zhong
52
0
0
18 Mar 2025
Deeply Supervised Flow-Based Generative Models
Inkyu Shin
Chenglin Yang
Liang-Chieh Chen
58
0
0
18 Mar 2025
MeshFleet: Filtered and Annotated 3D Vehicle Dataset for Domain Specific Generative Modeling
Damian Boborzi
Phillip Mueller
Jonas Emrich
Dominik Schmid
Sebastian Mueller
Lars Mikelsons
DiffM
67
0
0
18 Mar 2025
An interpretable approach to automating the assessment of biofouling in video footage
Evelyn J. Mannix
Bartholomew A. Woodham
56
0
0
17 Mar 2025
DeGauss: Dynamic-Static Decomposition with Gaussian Splatting for Distractor-free 3D Reconstruction
Rui Wang
Q. Lohmeyer
Mirko Meboldt
Siyu Tang
3DGS
59
0
0
17 Mar 2025
8-Calves Image dataset
Xuyang Fang
S. Hannuna
Neill D. F. Campbell
89
0
0
17 Mar 2025
PASTA: Part-Aware Sketch-to-3D Shape Generation with Text-Aligned Prior
S. Lee
Hwanhee Jung
Byoungsoo Koh
Qixing Huang
Sangho Yoon
Sangpil Kim
44
0
0
17 Mar 2025
Learning-based 3D Reconstruction in Autonomous Driving: A Comprehensive Survey
Liewen Liao
Weihao Yan
Ming Yang
Songan Zhang
3DV
86
0
0
17 Mar 2025
Towards Scalable Foundation Model for Multi-modal and Hyperspectral Geospatial Data
Haozhe Si
Yuxuan Wan
Minh Do
Deepak Vasisht
Han Zhao
Hendrik Hamann
41
0
0
17 Mar 2025
ASMR: Adaptive Skeleton-Mesh Rigging and Skinning via 2D Generative Prior
Seokhyeon Hong
Soojin Choi
Chaelin Kim
Sihun Cha
Junyong Noh
3DH
55
0
0
17 Mar 2025
E-Values Expand the Scope of Conformal Prediction
Etienne Gauthier
Francis Bach
Michael I. Jordan
42
1
0
17 Mar 2025
MTGS: Multi-Traversal Gaussian Splatting
Tianyu Li
Yihang Qiu
Zhenhua Wu
Carl Lindström
Peng Su
Matthias Nießner
Hongyang Li
3DGS
62
0
0
16 Mar 2025
Multi Activity Sequence Alignment via Implicit Clustering
Taein Kwon
Zador Pataki
Mahdi Rad
Marc Pollefeys
HAI
AI4TS
60
0
0
16 Mar 2025
VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction
Zijian He
Yuwei Ning
Yipeng Qin
Wangrun Wang
Sibei Yang
Liang Lin
G. Li
57
1
0
15 Mar 2025
Snapmoji: Instant Generation of Animatable Dual-Stylized Avatars
Eric M. Chen
Di Liu
Sizhuo Ma
Michael Vasilkovsky
Bing Zhou
...
W. Wang
Jiahao Luo
Dimitris N. Metaxas
Vincent Sitzmann
Jian Wang
3DGS
52
0
0
15 Mar 2025
Provenance Detection for AI-Generated Images: Combining Perceptual Hashing, Homomorphic Encryption, and AI Detection Models
Shree Singhi
Aayan Yadav
Aayush Gupta
Shariar Ebrahimi
Parisa Hassanizadeh
36
0
0
14 Mar 2025
VGGT: Visual Geometry Grounded Transformer
Jianyuan Wang
Minghao Chen
Nikita Karaev
Andrea Vedaldi
Christian Rupprecht
David Novotny
ViT
48
6
0
14 Mar 2025
Unlocking Open-Set Language Accessibility in Vision Models
Fawaz Sammani
Jonas Fischer
Nikos Deligiannis
VLM
53
0
0
14 Mar 2025
Self-Supervised Pretraining for Fine-Grained Plankton Recognition
Joona Kareinen
T. Eerola
K. Kraft
L. Lensu
S. Suikkanen
H. Kalviainen
SSL
108
0
0
14 Mar 2025
Observation-Graph Interaction and Key-Detail Guidance for Vision and Language Navigation
Yifan Xie
Binkai Ou
Fei Ma
Yaohua Liu
42
0
0
14 Mar 2025
Multi-View Industrial Anomaly Detection with Epipolar Constrained Cross-View Fusion
Yifan Liu
Xun Xu
Shijie Li
Jingyi Liao
Xulei Yang
41
0
0
14 Mar 2025
EgoSplat: Open-Vocabulary Egocentric Scene Understanding with Language Embedded 3D Gaussian Splatting
Di Li
Jie Feng
Jiahao Chen
Weisheng Dong
Guanbin Li
G. Shi
Licheng Jiao
3DGS
VLM
109
0
0
14 Mar 2025
APLA: A Simple Adaptation Method for Vision Transformers
Moein Sorkhei
Emir Konuk
Kevin Smith
Christos Matsoukas
53
0
0
14 Mar 2025
AugGen: Synthetic Augmentation Can Improve Discriminative Models
Parsa Rahimi
Damien Teney
Sébastien Marcel
64
0
0
14 Mar 2025
Towards a Unified Copernicus Foundation Model for Earth Vision
Yi Wang
Zhitong Xiong
Chenying Liu
Adam J. Stewart
Thomas Dujardin
...
Angelos Zavras
Franziska Gerken
Ioannis Papoutsis
Laura Leal-Taixé
Xiao Xiang Zhu
44
1
0
14 Mar 2025
Panopticon: Advancing Any-Sensor Foundation Models for Earth Observation
Leonard Waldmann
Ando Shah
Yi Wang
Nils Lehmann
Adam J. Stewart
Zhitong Xiong
Xiao Xiang Zhu
Stefan Bauer
John Chuang
41
1
0
13 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
S. Zhang
64
6
0
13 Mar 2025
OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting
Yongsheng Yu
Ziyun Zeng
Haitian Zheng
Jiebo Luo
DiffM
59
0
0
13 Mar 2025
SmartWay: Enhanced Waypoint Prediction and Backtracking for Zero-Shot Vision-and-Language Navigation
Xiangyu Shi
Zerui Li
Wenqi Lyu
Jiatong Xia
Feras Dayoub
Yanyuan Qiao
Qi Wu
46
0
0
13 Mar 2025
Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective
Xiaoming Zhao
Alexander Schwing
FaML
63
0
0
13 Mar 2025
One-Shot Federated Unsupervised Domain Adaptation with Scaled Entropy Attention and Multi-Source Smoothed Pseudo Labeling
Ali Abedi
Qiang Wu
Ning Zhang
Farhad Pourpanah
FedML
63
0
0
13 Mar 2025
Interpretable Image Classification via Non-parametric Part Prototype Learning
Zhijie Zhu
Lei Fan
M. Pagnucco
Yang Song
39
0
0
13 Mar 2025
Towards Fast, Memory-based and Data-Efficient Vision-Language Policy
Haoxuan Li
Sixu Yan
Y. Li
Xinggang Wang
LM&Ro
59
0
0
13 Mar 2025
RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models
Yijing Lin
Mengqi Huang
Shuhan Zhuang
Zhendong Mao
VGen
43
0
0
13 Mar 2025