2309.16588
Vision Transformers Need Registers
28 September 2023
Timothée Darcet
Maxime Oquab
Julien Mairal
Piotr Bojanowski
ViT
Papers citing
"Vision Transformers Need Registers"
39 / 239 papers shown
Title
CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning
Ziyang Gong
Fuhao Li
Yupeng Deng
Deblina Bhattacharjee
Xianzheng Ma
Xiangwei Zhu
Zhenming Ji
41
2
0
26 Mar 2024
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
S. A. Baumann
Felix Krause
Michael Neumayr
Nick Stracke
Vincent Tao Hu
Björn Ommer
DiffM
LM&Ro
46
11
0
25 Mar 2024
Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
Mu Hu
Wei Yin
C. Zhang
Zhipeng Cai
Xiaoxiao Long
Kaixuan Wang
Gang Yu
Chunhua Shen
Shaojie Shen
3DGS
42
110
0
22 Mar 2024
LUWA Dataset: Learning Lithic Use-Wear Analysis on Microscopic Images
Jing Zhang
Irving Fang
Juexiao Zhang
Hao Wu
Akshat Kaushik
Alice Rodriguez
Hanwen Zhao
Zhuo Zheng
Radu Iovita
Chen Feng
14
3
0
19 Mar 2024
TTT-KD: Test-Time Training for 3D Semantic Segmentation through Knowledge Distillation from Foundation Models
Lisa Weijler
Muhammad Jehanzeb Mirza
Leon Sick
Can Ekkazan
Pedro Hermosilla
TTA
28
0
0
18 Mar 2024
Conditional computation in neural networks: principles and research trends
Simone Scardapane
Alessandro Baiocchi
Alessio Devoto
V. Marsocci
Pasquale Minervini
Jary Pomponi
27
0
0
12 Mar 2024
ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes
H. Malik
Muhammad Huzaifa
Muzammal Naseer
Salman Khan
Fahad Shahbaz Khan
DiffM
37
2
0
07 Mar 2024
ComFe: An Interpretable Head for Vision Transformers
Evelyn J. Mannix
Howard Bondell
VLM
ViT
14
1
0
07 Mar 2024
HyenaPixel: Global Image Context with Convolutions
Julian Spravil
Sebastian Houben
Sven Behnke
16
1
0
29 Feb 2024
Massive Activations in Large Language Models
Mingjie Sun
Xinlei Chen
J. Zico Kolter
Zhuang Liu
57
64
0
27 Feb 2024
GROUNDHOG: Grounding Large Language Models to Holistic Segmentation
Yichi Zhang
Ziqiao Ma
Xiaofeng Gao
Suhaila Shakiah
Qiaozi Gao
Joyce Chai
MLLM
VLM
24
38
0
26 Feb 2024
General Purpose Image Encoder DINOv2 for Medical Image Registration
Xin Song
Xuanang Xu
Pingkun Yan
MedIm
25
5
0
24 Feb 2024
Attention-aware Semantic Communications for Collaborative Inference
Jiwoong Im
Nayoung Kwon
Taewoo Park
Jiheon Woo
Jaeho Lee
Yongjune Kim
18
2
0
23 Feb 2024
Convincing Rationales for Visual Question Answering Reasoning
Kun Li
G. Vosselman
Michael Ying Yang
22
1
0
06 Feb 2024
FindingEmo: An Image Dataset for Emotion Recognition in the Wild
Laurent Mertens
E. Yargholi
H. O. D. Beeck
Jan Van den Stock
Joost Vennekens
VLM
20
4
0
02 Feb 2024
Understanding Video Transformers via Universal Concept Discovery
M. Kowal
Achal Dave
Rares Ambrus
Adrien Gaidon
Konstantinos G. Derpanis
P. Tokmakov
ViT
24
2
0
19 Jan 2024
Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation
Mathis Petrovich
Or Litany
Umar Iqbal
Michael J. Black
Gül Varol
Xue Bin Peng
Davis Rempe
DiffM
VGen
24
40
0
16 Jan 2024
RudolfV: A Foundation Model by Pathologists for Pathologists
Jonas Dippel
Barbara Feulner
Tobias Winterhoff
Timo Milbich
Stephan Tietz
...
David Horst
Lukas Ruff
Klaus-Robert Müller
Frederick Klauschen
Maximilian Alber
18
28
0
08 Jan 2024
Analyzing Local Representations of Self-supervised Vision Transformers
Ani Vanyan
Alvard Barseghyan
Hakob Tamazyan
Vahan Huroyan
Hrant Khachatrian
Martin Danelljan
25
2
0
31 Dec 2023
SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation
Yuxuan Zhang
Yiren Song
Jiaming Liu
Rui Wang
Jinpeng Yu
...
Huaxia Li
Xu Tang
Yao Hu
Han Pan
Zhongliang Jing
8
58
0
26 Dec 2023
CLIP-DINOiser: Teaching CLIP a few DINO tricks for open-vocabulary semantic segmentation
Monika Wysoczańska
Oriane Siméoni
Michael Ramamonjisoa
Andrei Bursuc
Tomasz Trzciński
Patrick Pérez
VLM
CLIP
19
29
0
19 Dec 2023
Open Vocabulary Semantic Scene Sketch Understanding
Ahmed Bourouis
Judith E. Fan
Yulia Gryaditskaya
VLM
3DV
8
0
0
18 Dec 2023
Diffusion Illusions: Hiding Images in Plain Sight
R. Burgert
Xiang Li
Abe Leite
Kanchana Ranasinghe
Michael S. Ryoo
35
8
0
06 Dec 2023
Class-Discriminative Attention Maps for Vision Transformers
L. Brocki
Jakub Binda
N. C. Chung
MedIm
14
3
0
04 Dec 2023
FoundPose: Unseen Object Pose Estimation with Foundation Features
Evin Pınar Örnek
Yann Labbé
Bugra Tekin
Lingni Ma
Cem Keskin
Christian Forster
Tomáš Hodaň
10
38
0
30 Nov 2023
Emergent Open-Vocabulary Semantic Segmentation from Off-the-shelf Vision-Language Models
Jiayun Luo
Siddhesh Khandelwal
Leonid Sigal
Boyang Albert Li
MLLM
VLM
22
7
0
28 Nov 2023
Frozen Transformers in Language Models Are Effective Visual Encoder Layers
Ziqi Pang
Ziyang Xie
Yunze Man
Yu-xiong Wang
38
12
0
19 Oct 2023
Guiding Language Model Math Reasoning with Planning Tokens
Xinyi Wang
Lucas Page-Caccia
O. Ostapenko
Xingdi Yuan
William Yang Wang
Alessandro Sordoni
LRM
26
2
0
09 Oct 2023
Think before you speak: Training Language Models With Pause Tokens
Sachin Goyal
Ziwei Ji
A. S. Rawat
A. Menon
Sanjiv Kumar
Vaishnavh Nagarajan
LRM
10
92
0
03 Oct 2023
Dynamic Attention-Guided Diffusion for Image Super-Resolution
Brian B. Moser
Stanislav Frolov
Federico Raue
Sebastián M. Palacio
Andreas Dengel
DiffM
9
3
0
15 Aug 2023
PerceptionCLIP: Visual Classification by Inferring and Conditioning on Contexts
Bang An
Sicheng Zhu
Michael-Andrei Panaitescu-Liess
Chaithanya Kumar Mummadi
Furong Huang
VLM
12
7
0
02 Aug 2023
CoTracker: It is Better to Track Together
Nikita Karaev
Ignacio Rocco
Benjamin Graham
Natalia Neverova
Andrea Vedaldi
Christian Rupprecht
VOT
ViT
25
243
0
14 Jul 2023
OpenVIS: Open-vocabulary Video Instance Segmentation
Pinxue Guo
Tony Huang
Peiyang He
Xuefeng Liu
Tianjun Xiao
Zhaoyu Chen
Wenqiang Zhang
VLM
25
16
0
26 May 2023
Training-Free Acceleration of ViTs with Delayed Spatial Merging
J. Heo
Seyedarmin Azizi
A. Fayyazi
Massoud Pedram
20
3
0
04 Mar 2023
Masked Autoencoders Are Scalable Vision Learners
Kaiming He
Xinlei Chen
Saining Xie
Yanghao Li
Piotr Dollár
Ross B. Girshick
ViT
TPM
255
7,337
0
11 Nov 2021
ViDT: An Efficient and Effective Fully Transformer-based Object Detector
Hwanjun Song
Deqing Sun
Sanghyuk Chun
Varun Jampani
Dongyoon Han
Byeongho Heo
Wonjae Kim
Ming-Hsuan Yang
78
75
0
08 Oct 2021
Localizing Objects with Self-Supervised Transformers and no Labels
Oriane Siméoni
Gilles Puy
Huy V. Vo
Simon Roburin
Spyros Gidaris
Andrei Bursuc
P. Pérez
Renaud Marlet
Jean Ponce
ViT
159
195
0
29 Sep 2021
Emerging Properties in Self-Supervised Vision Transformers
Mathilde Caron
Hugo Touvron
Ishan Misra
Hervé Jégou
Julien Mairal
Piotr Bojanowski
Armand Joulin
283
5,723
0
29 Apr 2021
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky
Jia Deng
Hao Su
J. Krause
S. Satheesh
...
A. Karpathy
A. Khosla
Michael S. Bernstein
Alexander C. Berg
Li Fei-Fei
VLM
ObjD
279
39,083
0
01 Sep 2014