2304.07193
Cited By
DINOv2: Learning Robust Visual Features without Supervision
14 April 2023
Maxime Oquab
Timothée Darcet
Théo Moutakanni
Huy Q. Vo
Marc Szafraniec
Vasil Khalidov
Pierre Fernandez
Daniel Haziza
Francisco Massa
Alaaeldin El-Nouby
Mahmoud Assran
Nicolas Ballas
Wojciech Galuba
Russ Howes
Po-Yao (Bernie) Huang
Shang-Wen Li
Ishan Misra
Michael G. Rabbat
Vasu Sharma
Gabriel Synnaeve
Huijiao Xu
Hervé Jégou
Julien Mairal
Patrick Labatut
Armand Joulin
Piotr Bojanowski
VLM
CLIP
SSL
Papers citing
"DINOv2: Learning Robust Visual Features without Supervision"
50 / 2,189 papers shown
Title
Interpretable Image Classification via Non-parametric Part Prototype Learning
Zhijie Zhu
Lei Fan
M. Pagnucco
Yang Song
39
0
0
13 Mar 2025
HybridVLA: Collaborative Diffusion and Autoregression in a Unified Vision-Language-Action Model
Jiaming Liu
Hao Chen
Pengju An
Zhuoyang Liu
Renrui Zhang
...
Chengkai Hou
Mengdi Zhao
KC alex Zhou
Pheng-Ann Heng
S. Zhang
64
7
0
13 Mar 2025
The Power of One: A Single Example is All it Takes for Segmentation in VLMs
Mir Rayat Imtiaz Hossain
Mennatullah Siam
Leonid Sigal
James J. Little
MLLM
VLM
72
0
0
13 Mar 2025
Robustness Tokens: Towards Adversarial Robustness of Transformers
Brian Pulfer
Yury Belousov
S. Voloshynovskiy
AAML
37
0
0
13 Mar 2025
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models
Sangwon Jang
June Suk Choi
Jaehyeong Jo
Kimin Lee
Sung Ju Hwang
DiffM
WIGM
79
1
0
12 Mar 2025
Efficient Alignment of Unconditioned Action Prior for Language-conditioned Pick and Place in Clutter
Kechun Xu
Xunlong Xia
Kaixuan Wang
Yifei Yang
Yunxuan Mao
Bing Deng
R. Xiong
Y. Wang
OffRL
64
0
0
12 Mar 2025
Online Language Splatting
Saimouli Katragadda
Cho-Ying Wu
Yuliang Guo
Xinyu Huang
G. Huang
Liu Ren
3DGS
OffRL
60
0
0
12 Mar 2025
SDD-4DGS: Static-Dynamic Aware Decoupling in Gaussian Splatting for 4D Scene Reconstruction
Dai Sun
Huhao Guan
Kun Zhang
Xike Xie
S. Kevin Zhou
3DGS
58
0
0
12 Mar 2025
SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation
Junsong Chen
Shuchen Xue
Yuyang Zhao
Jincheng Yu
Sayak Paul
Junyu Chen
Han Cai
E. Xie
Song Han
VLM
64
2
0
12 Mar 2025
Towards Robust Multimodal Representation: A Unified Approach with Adaptive Experts and Alignment
Nazanin Moradinasab
S. Sengupta
Jiebei Liu
Sana Syed
Donald Brown
58
0
0
12 Mar 2025
Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection
Yucheng Suo
Fan Ma
Kaixin Shen
Linchao Zhu
Yi Yang
VLM
47
0
0
12 Mar 2025
Isolated Channel Vision Transformers: From Single-Channel Pretraining to Multi-Channel Finetuning
Wenyi Lian
Joakim Lindblad
Patrick Micke
Natasa Sladoje
57
0
0
12 Mar 2025
Quality Over Quantity? LLM-Based Curation for a Data-Efficient Audio-Video Foundation Model
Ali Vosoughi
Dimitra Emmanouilidou
H. Gamper
53
0
0
12 Mar 2025
DRESS: Disentangled Representation-based Self-Supervised Meta-Learning for Diverse Tasks
Wei Cui
Tongzi Wu
Jesse C. Cresswell
Yi Sui
Keyvan Golestan
60
0
0
12 Mar 2025
Close-up-GS: Enhancing Close-Up View Synthesis in 3D Gaussian Splatting with Progressive Self-Training
Jiatong Xia
Lingqiao Liu
3DGS
58
0
0
12 Mar 2025
CleverDistiller: Simple and Spatially Consistent Cross-modal Distillation
Hariprasath Govindarajan
Maciej K. Wozniak
Marvin Klingner
Camille Maurice
B. R. Kiran
S. Yogamani
53
0
0
12 Mar 2025
Discovering Influential Neuron Path in Vision Transformers
Yifan Wang
Yifei Liu
Yingdong Shi
C. Li
Anqi Pang
Sibei Yang
Jingyi Yu
Kan Ren
ViT
69
0
0
12 Mar 2025
Object-Aware DINO (Oh-A-Dino): Enhancing Self-Supervised Representations for Multi-Object Instance Retrieval
Stefan Sylvius Wagner
Stefan Harmeling
OCL
68
0
0
12 Mar 2025
InteractEdit: Zero-Shot Editing of Human-Object Interactions in Images
Jiun Tian Hoe
Weipeng Hu
Wei Zhou
Chao Xie
Ziwei Wang
Chee Seng Chan
Xudong Jiang
Y. Tan
61
0
0
12 Mar 2025
Exploring Position Encoding in Diffusion U-Net for Training-free High-resolution Image Generation
Feng Zhou
Pu Cao
Yiyang Ma
Lu Yang
Jianqin Yin
DiffM
46
0
0
12 Mar 2025
CQVPR: Landmark-aware Contextual Queries for Visual Place Recognition
Dongyue Li
Daisuke Deguchi
Hiroshi Murase
58
0
0
11 Mar 2025
MsaMIL-Net: An End-to-End Multi-Scale Aware Multiple Instance Learning Network for Efficient Whole Slide Image Classification
Jiangping Wen
Jinyu Wen
Meie Fang
48
0
0
11 Mar 2025
SARA: Structural and Adversarial Representation Alignment for Training-efficient Diffusion Models
Hesen Chen
Junyan Wang
Zhiyu Tan
Hao Li
53
0
0
11 Mar 2025
MGHanD: Multi-modal Guidance for authentic Hand Diffusion
Taehyeon Eum
Jieun Choi
Tae-Kyun Kim
38
0
0
11 Mar 2025
Enhancing Sentiment Analysis through Multimodal Fusion: A BERT-DINOv2 Approach
Taoxu Zhao
Meisi Li
Kehao Chen
Liye Wang
Xucheng Zhou
Kunal Chaturvedi
Mukesh Prasad
Ali Anaissi
Ali Braytee
53
0
0
11 Mar 2025
"Principal Components" Enable A New Language of Images
Xin Wen
Bingchen Zhao
Ismail Elezi
Jiankang Deng
Xiaojuan Qi
59
0
0
11 Mar 2025
WildSeg3D: Segment Any 3D Objects in the Wild from 2D Images
Yansong Guo
Jie Hu
Yansong Qu
Liujuan Cao
3DGS
119
0
0
11 Mar 2025
FP3: A 3D Foundation Policy for Robotic Manipulation
Rujia Yang
Geng Chen
Chuan Wen
Yang Gao
LM&Ro
73
1
0
11 Mar 2025
OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models
Jialv Zou
Bencheng Liao
Qian Zhang
Wenyu Liu
Xinggang Wang
Mamba
MLLM
82
1
0
11 Mar 2025
1LoRA: Summation Compression for Very Low-Rank Adaptation
Alessio Quercia
Zhuo Cao
Arya Bangun
Richard D. Paul
Abigail Morrison
Ira Assent
Hanno Scharr
55
0
0
11 Mar 2025
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models
Kwan Yun
Seokhyeon Hong
Chaelin Kim
Junyong Noh
DiffM
VGen
43
0
0
11 Mar 2025
FPGS: Feed-Forward Semantic-aware Photorealistic Style Transfer of Large-Scale Gaussian Splatting
GeonU Kim
Kim Youwang
Lee Hyoseok
Tae-Hyun Oh
3DGS
75
0
0
11 Mar 2025
Seal Your Backdoor with Variational Defense
Ivan Sabolić
Matej Grcić
Sinisa Segvic
AAML
109
0
0
11 Mar 2025
Pre-trained Models Succeed in Medical Imaging with Representation Similarity Degradation
Wenqiang Zu
Shenghao Xie
Hao Chen
Lei Ma
MedIm
42
0
0
11 Mar 2025
Twinner: Shining Light on Digital Twins in a Few Snaps
Jesus Zarzar
Tom Monnier
Roman Shapovalov
Andrea Vedaldi
David Novotny
45
0
0
11 Mar 2025
SignRep: Enhancing Self-Supervised Sign Representations
Ryan Wong
Necati Cihan Camgöz
Richard Bowden
SLR
53
0
0
11 Mar 2025
Toward Stable World Models: Measuring and Addressing World Instability in Generative Environments
Soonwoo Kwon
Jin-Young Kim
Hyojun Go
Kyungjune Baek
53
0
0
11 Mar 2025
MVGSR: Multi-View Consistency Gaussian Splatting for Robust Surface Reconstruction
Chenfeng Hou
Qi Xun Yeo
Mengqi Guo
Yongxin Su
Yanyan Li
G. Lee
3DGS
68
2
0
11 Mar 2025
PE3R: Perception-Efficient 3D Reconstruction
Jie Hu
Shizun Wang
Xinchao Wang
61
0
0
10 Mar 2025
Task-Specific Knowledge Distillation from the Vision Foundation Model for Enhanced Medical Image Segmentation
Pengchen Liang
Haishan Huang
Bin Pu
Jianguo Chen
Xiang Hua
Jing Zhang
Weibo Ma
Z. Chen
Yiwei Li
Qing Chang
43
0
0
10 Mar 2025
Visual and Text Prompt Segmentation: A Novel Multi-Model Framework for Remote Sensing
Xing Zi
Kairui Jin
Xian Tao
Jun Li
Ali Braytee
Rajiv Ratn Shah
Mukesh Prasad
VLM
62
0
0
10 Mar 2025
SIRE: SE(3) Intrinsic Rigidity Embeddings
Cameron Smith
Basile Van Hoorick
Vitor Campagnolo Guizilini
Yue Wang
42
0
0
10 Mar 2025
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model
Lixue Gong
Xiaoxia Hou
Fanshi Li
Liang Li
Xiaochen Lian
...
Qi Zhang
Yuwei Zhang
Shijia Zhao
Jianchao Yang
Weilin Huang
DiffM
VLM
55
6
0
10 Mar 2025
Semi-Supervised Medical Image Segmentation via Knowledge Mining from Large Models
Yuchen Mao
Hongwei Bran Li
Yinyi Lai
G. Papanastasiou
Peng Qi
Yunjie Yang
Chengjia Wang
VLM
39
1
0
10 Mar 2025
Alligat0R: Pre-Training Through Co-Visibility Segmentation for Relative Camera Pose Regression
Thibaut Loiseau
Guillaume Bourmaud
Vincent Lepetit
62
0
0
10 Mar 2025
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen
Bingchen Zhao
Yilun Chen
Jiangmiao Pang
Xiaojuan Qi
LM&Ro
39
0
0
10 Mar 2025
Endo-FASt3r: Endoscopic Foundation model Adaptation for Structure from motion
Mona Sheikh Zeinoddin
Mobarakol Islam
Zafer Tandogdu
Greg Shaw
Mathew J. Clarkson
E. Mazomenos
Danail Stoyanov
103
0
0
10 Mar 2025
Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization
Michael Green
Matan Levy
Issar Tzachor
Dvir Samuel
N. Darshan
Rami Ben-Ari
54
0
0
10 Mar 2025
Keeping Representation Similarity in Finetuning for Medical Image Analysis
Wenqiang Zu
Shenghao Xie
Hao Chen
Yiming Liang
Lei Ma
MedIm
OOD
43
0
0
10 Mar 2025
VORTEX: Challenging CNNs at Texture Recognition by using Vision Transformers with Orderless and Randomized Token Encodings
Leonardo F. S. Scabini
Kallil M. C. Zielinski
Emir Konuk
Ricardo T. Fares
L. C. Ribas
Kevin Smith
Odemir M. Bruno
ViT
44
0
0
09 Mar 2025