ResearchTrend.AI
DINOv2: Learning Robust Visual Features without Supervision

14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
    VLM
    CLIP
    SSL

Papers citing "DINOv2: Learning Robust Visual Features without Supervision"

50 / 2,189 papers shown
Title
FlowCLAS: Enhancing Normalizing Flow Via Contrastive Learning For Anomaly Segmentation
Chang Won Lee
Selina Leveugle
Svetlana Stolpner
Chris Langley
Paul Grouchy
Jonathan Kelly
Steven Waslander
77
0
0
29 Nov 2024
CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
Qixiu Li
Yaobo Liang
Zeyu Wang
Lin Luo
Xi Chen
...
Jianmin Bao
Dong Chen
Yuanchun Shi
Jiaolong Yang
B. Guo
LM&Ro
83
23
0
29 Nov 2024
FairDD: Fair Dataset Distillation via Synchronized Matching
Qihang Zhou
Shenhao Fang
Shibo He
Wenchao Meng
Jiming Chen
FedML
DD
79
1
0
29 Nov 2024
Curriculum Fine-tuning of Vision Foundation Model for Medical Image Classification Under Label Noise
Yeonguk Yu
Minhwan Ko
Sungho Shin
Kangmin Kim
K. Lee
NoLa
79
1
0
29 Nov 2024
Effective Fine-Tuning of Vision-Language Models for Accurate Galaxy Morphology Analysis
Ruoqi Wang
Haitao Wang
Qiong Luo
75
0
0
29 Nov 2024
T-3DGS: Removing Transient Objects for 3D Scene Reconstruction
Vadim Pryadilshchikov
Alexander Markin
Artem Komarichev
Ruslan Rakhimov
Peter Wonka
Evgeny Burnaev
3DGS
79
1
0
29 Nov 2024
Explaining the Impact of Training on Vision Models via Activation Clustering
Ahcène Boubekki
Samuel G. Fadel
Sebastian Mair
89
0
0
29 Nov 2024
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
Luca Barsellotti
Lorenzo Bianchi
Nicola Messina
F. Carrara
Marcella Cornia
Lorenzo Baraldi
Fabrizio Falchi
Rita Cucchiara
VLM
72
2
0
28 Nov 2024
OMNI-DC: Highly Robust Depth Completion with Multiresolution Depth Integration
Yiming Zuo
Willow Yang
Zeyu Ma
Jia Deng
MDE
85
2
0
28 Nov 2024
Unleashing the Power of Data Synthesis in Visual Localization
Sihang Li
Siqi Tan
Bowen Chang
Jing Zhang
Chen Feng
Yiming Li
88
0
0
28 Nov 2024
Track Anything Behind Everything: Zero-Shot Amodal Video Object Segmentation
Finlay G. C. Hudson
W. Smith
VOS
VLM
76
0
0
28 Nov 2024
ETSM: Automating Dissection Trajectory Suggestion and Confidence Map-Based Safety Margin Prediction for Robot-assisted Endoscopic Submucosal Dissection
Mengya Xu
Wenjin Mo
Guankun Wang
Huxin Gao
An-Chi Wang
Long Bai
Chaoyang Lyu
Xiaoxiao Yang
Z. Li
Hongliang Ren
79
0
0
28 Nov 2024
Any-Resolution AI-Generated Image Detection by Spectral Learning
Dimitrios Karageorgiou
Symeon Papadopoulos
I. Kompatsiaris
Efstratios Gavves
103
0
0
28 Nov 2024
TAMT: Temporal-Aware Model Tuning for Cross-Domain Few-Shot Action Recognition
Yilong Wang
Zilin Gao
Qilong Wang
Zhaofeng Chen
P. Li
Q. Hu
80
1
0
28 Nov 2024
Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation
Yueru Jia
Jiaming Liu
Sixiang Chen
Chenyang Gu
Z. Wang
...
Lily Lee
Pengwei Wang
Zhongyuan Wang
Renrui Zhang
Shanghang Zhang
89
11
0
27 Nov 2024
VideoLLM Knows When to Speak: Enhancing Time-Sensitive Video Comprehension with Video-Text Duet Interaction Format
Yueqian Wang
Xiaojun Meng
Y. Wang
Jianxin Liang
Jiansheng Wei
Huishuai Zhang
Dongyan Zhao
VGen
83
8
0
27 Nov 2024
G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation
Tianxing Chen
Yao Mu
Zhixuan Liang
Z. Chen
Shijia Peng
...
Mingkun Xu
R. Hu
H. Zhang
Xuelong Li
Ping Luo
AI4CE
102
8
0
27 Nov 2024
Flaws of ImageNet, Computer Vision's Favourite Dataset
Nikita Kisel
Illia Volkov
Katerina Hanzelkova
Klara Janouskova
Jiří Matas
VLM
89
1
0
26 Nov 2024
A Distractor-Aware Memory for Visual Object Tracking with SAM2
Jovana Videnovic
A. Lukežič
Matej Kristan
VLM
86
1
0
26 Nov 2024
Spatially Visual Perception for End-to-End Robotic Learning
Travis Davies
Jiahuan Yan
Xiang Chen
Yu Tian
Yueting Zhuang
Yiqi Huang
Luhui Hu
70
0
0
26 Nov 2024
Depth-PC: A Visual Servo Framework Integrated with Cross-Modality Fusion for Sim2Real Transfer
Haoyu Zhang
Weiyang Lin
Yimu Jiang
Chao Ye
73
0
0
26 Nov 2024
Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation
Xiang Li
Zixuan Huang
Anh Thai
James M. Rehg
3DGS
77
0
0
26 Nov 2024
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
Xinhao Liu
J. Li
Yichen Jiang
Niranjan Sujay
Z. Yang
Juexiao Zhang
John Abanes
Jing Zhang
Chen Feng
112
1
0
26 Nov 2024
Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation
Chanyoung Kim
Dayun Ju
Woojung Han
Ming-Hsuan Yang
Seong Jae Hwang
VLM
VOS
79
0
0
26 Nov 2024
Online Episodic Memory Visual Query Localization with Egocentric Streaming Object Memory
Zaira Manigrasso
Matteo Dunnhofer
Antonino Furnari
Moritz Nottebaum
Antonio Finocchiaro
Davide Marana
G. Farinella
C. Micheloni
78
1
0
25 Nov 2024
Open Vocabulary Monocular 3D Object Detection
Jin Yao
Hao Gu
Xuweiyi Chen
Jiayun Wang
Zezhou Cheng
ObjD
VLM
71
3
0
25 Nov 2024
Diffusion Features for Zero-Shot 6DoF Object Pose Estimation
Bernd Von Gimborn
P. Ausserlechner
Markus Vincze
S. Thalhammer
DiffM
66
0
0
25 Nov 2024
Edge Weight Prediction For Category-Agnostic Pose Estimation
Or Hirschorn
S. Avidan
74
0
0
25 Nov 2024
Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency
Y. Wang
Jiajie Teng
Jiajiong Cao
Yuming Li
Chenguang Ma
Hongteng Xu
Dixin Luo
VGen
DiffM
70
0
0
25 Nov 2024
A Study on Unsupervised Domain Adaptation for Semantic Segmentation in the Era of Vision-Language Models
Manuel Schwonberg
Claus Werner
Hanno Gottschalk
Carsten Meyer
VLM
90
0
0
25 Nov 2024
Image Generation Diversity Issues and How to Tame Them
Mischa Dombrowski
Weitong Zhang
Sarah Cechnicka
Hadrien Reynaud
Bernhard Kainz
72
0
0
25 Nov 2024
Med-PerSAM: One-Shot Visual Prompt Tuning for Personalized Segment Anything Model in Medical Domain
Hangyul Yoon
Doohyuk Jang
JungEun Kim
Eunho Yang
VLM
MedIm
72
1
0
25 Nov 2024
UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image
Xingyu Liu
Gu Wang
Ruida Zhang
Chenyangguang Zhang
F. Tombari
Xiangyang Ji
191
2
0
25 Nov 2024
VideoOrion: Tokenizing Object Dynamics in Videos
Yicheng Feng
Yijiang Li
Wanpeng Zhang
Sipeng Zheng
Zongqing Lu
109
1
0
25 Nov 2024
SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
Yongwei Chen
Yushi Lan
Shangchen Zhou
Tengfei Wang
Xingang Pan
100
5
0
25 Nov 2024
ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
Yuhang Yang
Jinhong Deng
Wen Li
Lixin Duan
VLM
81
0
0
24 Nov 2024
Medical Slice Transformer: Improved Diagnosis and Explainability on 3D Medical Images with DINOv2
Gustav Muller-Franzes
Firas Khader
R. Siepmann
T. Han
Jakob Nikolas Kather
S. Nebelung
Daniel Truhn
MedIm
74
0
0
24 Nov 2024
PriorDiffusion: Leverage Language Prior in Diffusion Models for Monocular Depth Estimation
Ziyao Zeng
Jingcheng Ni
Daniel Wang
Patrick Rim
Younjoon Chung
Fengyu Yang
Byung-Woo Hong
A. Wong
DiffM
MDE
108
2
0
24 Nov 2024
Training an Open-Vocabulary Monocular 3D Object Detection Model without 3D Data
Rui Huang
Henry Zheng
Yan Wang
Zhuofan Xia
Marco Pavone
Gao Huang
3DPC
VLM
83
1
0
23 Nov 2024
Revelio: Interpreting and leveraging semantic information in diffusion models
Dahye Kim
Xavier Thomas
Deepti Ghadiyaram
83
4
0
23 Nov 2024
Twin Trigger Generative Networks for Backdoor Attacks against Object Detection
Zhiying Li
Zhi Liu
Guanggang Geng
Shreyank N. Gowda
Shuyuan Lin
Jian Weng
Xiaobo Jin
AAML
75
0
0
23 Nov 2024
Zero-Shot Coreset Selection: Efficient Pruning for Unlabeled Data
Brent A. Griffin
Jacob Marks
Jason J. Corso
VLM
74
2
0
22 Nov 2024
There is no SAMantics! Exploring SAM as a Backbone for Visual Understanding Tasks
Miguel Espinosa
Chenhongyi Yang
Linus Ericsson
Steven G. McDonagh
Elliot J. Crowley
VLM
70
0
0
22 Nov 2024
Design-o-meter: Towards Evaluating and Refining Graphic Designs
Sahil Goyal
Abhinav Mahajan
Swasti Mishra
Prateksha Udhayanan
Tripti Shukla
K. J. Joseph
Balaji Vasan Srinivasan
77
1
0
22 Nov 2024
RankByGene: Gene-Guided Histopathology Representation Learning Through Cross-Modal Ranking Consistency
Wentao Huang
Meilong Xu
Xiaoling Hu
Shahira Abousamra
Aniruddha Ganguly
...
Prateek Prasanna
Tahsin M. Kurc
Joel H. Saltz
Michael L. Miller
C. L. P. Chen
78
0
0
22 Nov 2024
Segment Any Class (SAC): Multi-Class Few-Shot Semantic Segmentation via Class Region Proposals
Hussni Mohd Zakir
Eric Tatt Wei Ho
VLM
72
0
0
21 Nov 2024
NexusSplats: Efficient 3D Gaussian Splatting in the Wild
Yuzhou Tang
Dejun Xu
Yongjie Hou
Zhenzhong Wang
Min Jiang
3DGS
76
1
0
21 Nov 2024
HF-Diff: High-Frequency Perceptual Loss and Distribution Matching for One-Step Diffusion-Based Image Super-Resolution
S. Sami
Md Golam Moula Mehedi Hasan
J. Dawson
Nasser M. Nasrabadi
DiffM
71
0
0
20 Nov 2024
XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
Ziyi Wang
Y. Wang
Xumin Yu
Jie Zhou
Jiwen Lu
74
0
0
20 Nov 2024
RAW-Diffusion: RGB-Guided Diffusion Models for High-Fidelity RAW Image Generation
Christoph Reinders
Radu Berdan
Beril Besbinar
Junji Otsuka
Daisuke Iso
81
2
0
20 Nov 2024