Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2304.07193
Cited By
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,169 papers shown
Title
SoccerDiffusion: Toward Learning End-to-End Humanoid Robot Soccer from Gameplay Recordings
Florian Vahl
Jörn Griepenburg
Jan Gutsche
Jasper Güldenstein
Jianwei Zhang
VGen
39
0
0
29 Apr 2025
X-Fusion: Introducing New Modality to Frozen Large Language Models
Sicheng Mo
Thao Nguyen
Xun Huang
Siddharth Srinivasan Iyer
Yijun Li
...
Eli Shechtman
Krishna Kumar Singh
Yong Jae Lee
Bolei Zhou
Yuheng Li
71
0
0
29 Apr 2025
GarmentX: Autoregressive Parametric Representations for High-Fidelity 3D Garment Generation
Jingfeng Guo
J. Chen
Weikai Chen
Zhenyu Sun
Lanjiong Li
Baozhu Zhao
Lingting Zhu
X. Wang
Qi Liu
3DH
80
0
0
29 Apr 2025
In-Context Edit: Enabling Instructional Image Editing with In-Context Generation in Large Scale Diffusion Transformer
Zechuan Zhang
Ji Xie
Yu Lu
Zongxin Yang
Y. Yang
DiffM
89
1
0
29 Apr 2025
Creating Your Editable 3D Photorealistic Avatar with Tetrahedron-constrained Gaussian Splatting
Hanxi Liu
Yifang Men
Zhouhui Lian
3DGS
33
0
0
29 Apr 2025
Do You Know the Way? Human-in-the-Loop Understanding for Fast Traversability Estimation in Mobile Robotics
Andre Schreiber
Katherine Rose Driggs-Campbell
63
0
0
28 Apr 2025
CompleteMe: Reference-based Human Image Completion
Yu-Ju Tsai
Brian L. Price
Qing Liu
Luis Figueroa
D. Pakhomov
Zhihong Ding
Scott D. Cohen
Ming Yang
3DH
47
0
0
28 Apr 2025
NORA: A Small Open-Sourced Generalist Vision Language Action Model for Embodied Tasks
Chia-Yu Hung
Qi Sun
Pengfei Hong
Amir Zadeh
Chuan Li
U-Xuan Tan
Navonil Majumder
Soujanya Poria
LM&Ro
37
1
0
28 Apr 2025
Prisma: An Open Source Toolkit for Mechanistic Interpretability in Vision and Video
Sonia Joseph
Praneet Suresh
Lorenz Hufe
Edward Stevinson
Robert Graham
Yash Vadi
Danilo Bzdok
Sebastian Lapuschkin
Lee Sharkey
Blake A. Richards
72
0
0
28 Apr 2025
PhenoAssistant: A Conversational Multi-Agent AI System for Automated Plant Phenotyping
Feng Chen
Ilias Stogiannidis
Andrew Wood
Danilo Bueno
Dominic Williams
...
Stephen A. Rolfe
Tracy Lawson
Tony Pridmore
M. Giuffrida
Sotirios A. Tsaftaris
62
0
0
28 Apr 2025
EarthMapper: Visual Autoregressive Models for Controllable Bidirectional Satellite-Map Translation
Zhe Dong
Yuzhe Sun
Tianzhu Liu
Wangmeng Zuo
Yanfeng Gu
48
0
0
28 Apr 2025
Pixels2Points: Fusing 2D and 3D Features for Facial Skin Segmentation
Victoria Yue Chen
Daoye Wang
Stephan Garbin
Jan Bednarík
Sebastian Winberg
Timo Bolkart
Thabo Beeler
3DH
3DPC
34
0
0
28 Apr 2025
CLR-Wire: Towards Continuous Latent Representations for 3D Curve Wireframe Generation
Xueqi Ma
Y. Liu
Tianlong Gao
Q. Huang
Hui Huang
3DV
AI4CE
37
0
0
27 Apr 2025
CARL: Camera-Agnostic Representation Learning for Spectral Image Analysis
Alexander Baumann
Leonardo Ayala
S.
Jan Sellner
Alexander Studier-Fischer
Berkin Özdemir
Lena Maier-Hein
Slobodan Ilic
51
0
0
27 Apr 2025
OpenFusion++: An Open-vocabulary Real-time Scene Understanding System
Xiaofeng Jin
Matteo Frosi
Matteo Matteucci
62
0
0
27 Apr 2025
Multi-Stage Boundary-Aware Transformer Network for Action Segmentation in Untrimmed Surgical Videos
Rezowan Shuvo
M S Mekala
Eyad Elyan
MedIm
48
0
0
26 Apr 2025
Eval3D: Interpretable and Fine-grained Evaluation for 3D Generation
Shivam Duggal
Yushi Hu
Oscar Michel
Aniruddha Kembhavi
William T. Freeman
Noah A. Smith
Ranjay Krishna
Antonio Torralba
Ali Farhadi
Wei-Chiu Ma
EGVM
ELM
70
0
0
25 Apr 2025
E-InMeMo: Enhanced Prompting for Visual In-Context Learning
Jiahao Zhang
Bowen Wang
Hong Liu
Liangzhi Li
Yuta Nakashima
Hajime Nagahara
VLM
99
0
0
25 Apr 2025
SSL4Eco: A Global Seasonal Dataset for Geospatial Foundation Models in Ecology
Elena Plekhanova
Damien Robert
Johannes Dollinger
Emilia Arens
Philipp Brun
Jan Dirk Wegner
Niklaus Zimmermann
19
0
0
25 Apr 2025
What is the Added Value of UDA in the VFM Era?
B. B. Englert
Tommie Kerssies
Gijs Dubbelman
37
0
0
25 Apr 2025
Improving Open-World Object Localization by Discovering Background
Ashish Singh
Michael J. Jones
Kuan-Chuan Peng
A. Cherian
Moitreya Chatterjee
Erik Learned-Miller
ObjD
OCL
VLM
64
0
0
24 Apr 2025
Lessons from Deploying Learning-based CSI Localization on a Large-Scale ISAC Platform
Tianyu Zhang
Dongheng Zhang
Ruixu Geng
Xuecheng Xie
Shuai Yang
Yan Chen
34
0
0
24 Apr 2025
The Fourth Monocular Depth Estimation Challenge
Anton Obukhov
Matteo Poggi
Fabio Tosi
Ripudaman Singh Arora
Jaime Spencer
...
Tuan-Anh Yang
Minh-Quang Nguyen
T. Tran
Albert Luginov
Muhammad Shahzad
MDE
46
0
0
24 Apr 2025
Predict-Optimize-Distill: A Self-Improving Cycle for 4D Object Understanding
Mingxuan Wu
Huang Huang
Justin Kerr
C. Kim
Anthony Zhang
Brent Yi
Angjoo Kanazawa
48
0
0
24 Apr 2025
Dargana: fine-tuning EarthPT for dynamic tree canopy mapping from space
Michael J. Smith
Luke Fleming
James E. Geach
Ryan J. Roberts
Freddie Kalaitzis
James Banister
24
0
0
24 Apr 2025
DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs
Z. Wang
Senthil Purushwalkam
Caiming Xiong
S.
Heng Ji
R. Xu
38
0
0
23 Apr 2025
V
2
^2
2
R-Bench: Holistically Evaluating LVLM Robustness to Fundamental Visual Variations
Zhiyuan Fan
Yumeng Wang
Sandeep Polisetty
Yi Ren Fung
43
0
0
23 Apr 2025
ManipDreamer: Boosting Robotic Manipulation World Model with Action Tree and Visual Guidance
Ying Li
Xiaobao Wei
Xiaowei Chi
Y. K. Li
Zhongyu Zhao
Hao Wang
Ningning MA
Ming Lu
Shanghang Zhang
VGen
39
0
0
23 Apr 2025
SmallGS: Gaussian Splatting-based Camera Pose Estimation for Small-Baseline Videos
Yuxin Yao
Yan Zhang
Zhening Huang
Joan Lasenby
3DGS
19
0
0
22 Apr 2025
DINOv2-powered Few-Shot Semantic Segmentation: A Unified Framework via Cross-Model Distillation and 4D Correlation Mining
Wei Zhuo
Zhiyue Tang
Wufeng Xue
Hao Ding
Linlin Shen
25
0
0
22 Apr 2025
AffordanceSAM: Segment Anything Once More in Affordance Grounding
D. Jiang
Mengmeng Wang
Teli Ma
H. Li
Y. Liu
Guang Dai
L. Zhang
32
0
0
22 Apr 2025
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
DiffM
26
0
0
22 Apr 2025
FreeGraftor: Training-Free Cross-Image Feature Grafting for Subject-Driven Text-to-Image Generation
Zebin Yao
Lei Ren
Huixing Jiang
Chen Wei
Xiaojie Wang
Ruifan Li
Fangxiang Feng
DiffM
69
0
0
22 Apr 2025
MonoTher-Depth: Enhancing Thermal Depth Estimation via Confidence-Aware Distillation
Xingxing Zuo
Nikhil Ranganathan
Connor T. Lee
Georgia Gkioxari
Soon-Jo Chung
VLM
51
1
0
21 Apr 2025
Insert Anything: Image Insertion via In-Context Editing in DiT
Wensong Song
Hong Jiang
Zongxing Yang
Ruijie Quan
Yi Yang
DiffM
40
0
0
21 Apr 2025
Instance-Adaptive Keypoint Learning with Local-to-Global Geometric Aggregation for Category-Level Object Pose Estimation
X. Zhang
Lu Zou
Tao Lu
Yuan Yao
Zhangjin Huang
Guoping Wang
3DPC
28
0
0
21 Apr 2025
Context Aware Grounded Teacher for Source Free Object Detection
Tajamul Ashraf
Rajes Manna
Partha Sarathi Purkayastha
Tavaheed Tariq
Janibul Bashir
25
0
0
21 Apr 2025
NTIRE 2025 Challenge on Real-World Face Restoration: Methods and Results
Zheng Chen
J. Wang
Kai Liu
Jue Gong
Lei Sun
...
J. Lee
C. Lee
Chih-Chung Hsu
Hu Peng
Chunming He
56
2
0
20 Apr 2025
Vision-Centric Representation-Efficient Fine-Tuning for Robust Universal Foreground Segmentation
Guoyi Zhang
Siyang Chen
Guangsheng Xu
Han Wang
Xiaohu Zhang
29
0
0
20 Apr 2025
Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark
Enxin Song
Wenhao Chai
Weili Xu
Jianwen Xie
Yuxuan Liu
Gaoang Wang
57
0
0
20 Apr 2025
Seurat: From Moving Points to Depth
Seokju Cho
Jiahui Huang
S. Kim
Joon-Young Lee
3DPC
MDE
29
0
0
20 Apr 2025
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D
Sergio Arnaud
Paul Mcvay
Ada Martin
Arjun Majumdar
Krishna Murthy Jatavallabhula
...
Nicolas Ballas
Mido Assran
Oleksandr Maksymets
Aravind Rajeswaran
Franziska Meier
3DPC
41
0
0
19 Apr 2025
Exploring Modality Guidance to Enhance VFM-based Feature Fusion for UDA in 3D Semantic Segmentation
Johannes Spoecklberger
W. Lin
Pedro Hermosilla
Sivan Doveh
Horst Possegger
M. Jehanzeb Mirza
17
0
0
19 Apr 2025
Learning Joint ID-Textual Representation for ID-Preserving Image Synthesis
Zichuan Liu
Liming Jiang
Qing Yan
Yumin Jia
Hao Kang
Xin Lu
DiffM
29
0
0
19 Apr 2025
Mono3R: Exploiting Monocular Cues for Geometric 3D Reconstruction
Wenyu Li
Sidun Liu
Peng Qiao
Yong Dou
25
0
0
18 Apr 2025
LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models
Haiwen Huang
Anpei Chen
Volodymyr Havrylov
Andreas Geiger
Dan Zhang
27
1
0
18 Apr 2025
Entropic Time Schedulers for Generative Diffusion Models
Dejan Stancevic
Luca Ambrogioni
DiffM
OOD
38
0
0
18 Apr 2025
BeetleVerse: A study on taxonomic classification of ground beetles
S M Rayeed
Alyson East
Samuel Stevens
Sydne Record
Charles V. Stewart
21
0
0
18 Apr 2025
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning
Yang Yue
Yulin Wang
Chenxin Tao
Pan Liu
Shiji Song
Gao Huang
MedIm
24
0
0
18 Apr 2025
EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance
Yang Yue
Yulin Wang
Haojun Jiang
Pan Liu
S. Song
Gao Huang
VGen
27
0
0
17 Apr 2025
Previous
1
2
3
4
5
...
42
43
44
Next