
Vision Transformers Need Registers


28 September 2023
Timothée Darcet
Maxime Oquab
Julien Mairal
Piotr Bojanowski
    ViT
arXiv:2309.16588

Papers citing "Vision Transformers Need Registers"

50 / 239 papers shown
Title
TIPS: Text-Image Pretraining with Spatial awareness
Kevis-Kokitsi Maninis
Kaifeng Chen
Soham Ghosh
Arjun Karpur
Koert Chen
...
Jan Dlabal
Dan Gnanapragasam
Mojtaba Seyedhosseini
Howard Zhou
Andre Araujo
VLM
23
3
0
21 Oct 2024
LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes
Juliette Marrie
Romain Menegaux
Michael Arbel
Diane Larlus
Julien Mairal
3DGS
37
1
0
18 Oct 2024
Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
Yuxiang Lu
Shengcao Cao
Yu-Xiong Wang
34
1
0
18 Oct 2024
Efficient Vision-Language Models by Summarizing Visual Tokens into Compact Registers
Yuxin Wen
Qingqing Cao
Qichen Fu
Sachin Mehta
Mahyar Najibi
VLM
20
4
0
17 Oct 2024
Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs
Tianyu Guo
Druv Pai
Yu Bai
Jiantao Jiao
Michael I. Jordan
Song Mei
13
9
0
17 Oct 2024
Towards Zero-Shot Camera Trap Image Categorization
Jiří Vyskočil
Lukas Picek
VLM
18
0
0
16 Oct 2024
OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation
Jinhan Li
Yifeng Zhu
Yuqi Xie
Zhenyu Jiang
Mingyo Seo
Georgios Pavlakos
Yuke Zhu
LM&Ro
23
31
0
15 Oct 2024
ControlMM: Controllable Masked Motion Generation
Ekkasit Pinyoanuntapong
Muhammad Usama Saleem
Korrawe Karunratanakul
Pu Wang
Hongfei Xue
C. L. P. Chen
Chuan Guo
Junli Cao
J. Ren
Sergey Tulyakov
VGen
24
4
0
14 Oct 2024
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Y. Zou
Tatsunori Hashimoto
VLM
56
3
0
14 Oct 2024
Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision
Shengcao Cao
Liang-Yan Gui
Yu-Xiong Wang
32
3
0
10 Oct 2024
Towards Interpreting Visual Information Processing in Vision-Language Models
Clement Neo
Luke Ong
Philip H. S. Torr
Mor Geva
David M. Krueger
Fazl Barez
78
6
0
09 Oct 2024
Round and Round We Go! What makes Rotary Positional Encodings useful?
Federico Barbero
Alex Vitvitskyi
Christos Perivolaropoulos
Razvan Pascanu
Petar Veličković
52
16
0
08 Oct 2024
Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers
Andrew F. Luo
Jacob Yeung
Rushikesh Zawar
Shaurya Dewan
Margaret M. Henderson
Leila Wehbe
Michael J. Tarr
21
3
0
07 Oct 2024
LoTLIP: Improving Language-Image Pre-training for Long Text Understanding
Wei Wu
Kecheng Zheng
Shuailei Ma
Fan Lu
Yuxin Guo
Yifei Zhang
Wei Chen
Qingpei Guo
Yujun Shen
Zheng-Jun Zha
VLM
14
0
0
07 Oct 2024
Improving Image Clustering with Artifacts Attenuation via Inference-Time Attention Engineering
Kazumoto Nakamura
Yuji Nozawa
Yu-Chieh Lin
K. Nakata
Youyang Ng
ViT
25
0
0
07 Oct 2024
T-JEPA: Augmentation-Free Self-Supervised Learning for Tabular Data
Hugo Thimonier
José Lucas De Melo Costa
Fabrice Popineau
Arpad Rimmel
Bich-Liên Doan
42
1
0
07 Oct 2024
Block Vecchia Approximation for Scalable and Efficient Gaussian Process Computations
Qilong Pan
Sameh Abdulah
M. Genton
Ying Sun
16
1
0
06 Oct 2024
SyllableLM: Learning Coarse Semantic Units for Speech Language Models
Alan Baade
Puyuan Peng
David F. Harwath
37
3
0
05 Oct 2024
Attention layers provably solve single-location regression
P. Marion
Raphael Berthier
Gérard Biau
Claire Boyer
38
2
0
02 Oct 2024
House of Cards: Massive Weights in LLMs
Jaehoon Oh
Seungjun Shin
Dokwan Oh
35
1
0
02 Oct 2024
softmax is not enough (for sharp out-of-distribution)
Petar Veličković
Christos Perivolaropoulos
Federico Barbero
Razvan Pascanu
29
17
0
01 Oct 2024
Task Success Prediction for Open-Vocabulary Manipulation Based on Multi-Level Aligned Representations
Miyu Goko
Motonari Kambara
Daichi Saito
Seitaro Otsuki
Komei Sugiura
17
2
0
01 Oct 2024
Multi-Atlas Brain Network Classification through Consistency Distillation and Complementary Information Fusion
Jiaxing Xu
Mengcheng Lan
Xia Dong
Kai He
Wei Zhang
Qingtian Bian
Yiping Ke
23
3
0
28 Sep 2024
ProMerge: Prompt and Merge for Unsupervised Instance Segmentation
Dylan Li
Gyungin Shin
12
3
0
27 Sep 2024
Multi-View and Multi-Scale Alignment for Contrastive Language-Image Pre-training in Mammography
Yuexi Du
John Onofrey
Nicha Dvornek
VLM
25
0
0
26 Sep 2024
Attention Prompting on Image for Large Vision-Language Models
Runpeng Yu
Weihao Yu
Xinchao Wang
VLM
17
5
0
25 Sep 2024
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang
V. Papyan
VLM
36
1
0
20 Sep 2024
Is Tokenization Needed for Masked Particle Modelling?
Matthew Leigh
Samuel Klein
François Charton
Tobias Golling
Lukas Heinrich
Michael Kagan
Ines Ochoa
Margarita Osadchy
17
4
0
19 Sep 2024
Detect Fake with Fake: Leveraging Synthetic Data-driven Representation for Synthetic Image Detection
Hina Otake
Yoshihiro Fukuhara
Yoshiki Kubotani
Shigeo Morishima
ViT
40
0
0
13 Sep 2024
ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation
Wenlong Huang
Chen Wang
Y. Li
Ruohan Zhang
Li Fei-Fei
37
81
0
03 Sep 2024
Can Transformers Do Enumerative Geometry?
Baran Hashemi
Roderic G. Corominas
Alessandro Giacchetto
32
2
0
27 Aug 2024
Dual-Path Adversarial Lifting for Domain Shift Correction in Online Test-time Adaptation
Yushun Tang
Shuoshuo Chen
Zhihe Lu
Xinchao Wang
Zhihai He
21
1
0
26 Aug 2024
Can Visual Foundation Models Achieve Long-term Point Tracking?
Görkay Aydemir
Weidi Xie
Fatma Güney
19
7
0
24 Aug 2024
KonvLiNA: Integrating Kolmogorov-Arnold Network with Linear Nyström Attention for feature fusion in Crop Field Detection
Haruna Yunusa
Qin Shiyin
A. Lawan
Abdulrahman Hamman Adama Chukkol
26
1
0
23 Aug 2024
Building and better understanding vision-language models: insights and future directions
Hugo Laurençon
Andrés Marafioti
Victor Sanh
Léo Tronchon
VLM
29
60
0
22 Aug 2024
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments
Kazi Hasan Ibn Arif
JinYi Yoon
Dimitrios S. Nikolopoulos
Hans Vandierendonck
Deepu John
Bo Ji
MLLM
VLM
27
14
0
20 Aug 2024
MePT: Multi-Representation Guided Prompt Tuning for Vision-Language Model
Xinyang Wang
Yi Yang
Minfeng Zhu
Kecheng Zheng
Shi Liu
Wei Chen
VPVLM
MLLM
VLM
28
1
0
19 Aug 2024
Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models
Lin Zhao
Xiao Chen
Eric Z. Chen
Yikang Liu
Terrence Chen
Shanhui Sun
VLM
44
5
0
16 Aug 2024
Unsupervised Part Discovery via Dual Representation Alignment
Jiahao Xia
Wenjian Huang
Min Xu
Jianguo Zhang
Haimin Zhang
Ziyu Sheng
Dong Xu
21
0
0
15 Aug 2024
Human-inspired Explanations for Vision Transformers and Convolutional Neural Networks
Mahadev Prasad Panda
Matteo Tiezzi
Martina Vilas
Gemma Roig
Bjoern M. Eskofier
Dario Zanca
ViT
AAML
16
1
0
04 Aug 2024
Virchow2: Scaling Self-Supervised Mixed Magnification Models in Pathology
Eric Zimmermann
Eugene Vorontsov
Julian Viret
Adam Casson
Michal Zelechowski
...
Razik Yousfi
Thomas J. Fuchs
Nicolò Fusi
Siqi Liu
Kristen Severson
MedIm
16
25
0
01 Aug 2024
Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs
Shiping Liu
Kecheng Zheng
Wei Chen
MLLM
33
33
0
31 Jul 2024
EUDA: An Efficient Unsupervised Domain Adaptation via Self-Supervised Vision Transformer
Ali Abedi
Q. M. Jonathan Wu
Ning Zhang
Farhad Pourpanah
21
2
0
31 Jul 2024
Improving 2D Feature Representations by 3D-Aware Fine-Tuning
Yuanwen Yue
Anurag Das
Francis Engelmann
Siyu Tang
J. E. Lenssen
25
23
0
29 Jul 2024
Theia: Distilling Diverse Vision Foundation Models for Robot Learning
Jinghuan Shang
Karl Schmeckpeper
Brandon B. May
M. Minniti
Tarik Kelestemur
David Watkins
Laura Herlant
VLM
24
23
0
29 Jul 2024
Mixture of Nested Experts: Adaptive Processing of Visual Tokens
Gagan Jain
Nidhi Hegde
Aditya Kusupati
Arsha Nagrani
Shyamal Buch
Prateek Jain
Anurag Arnab
Sujoy Paul
MoE
25
7
0
29 Jul 2024
Unsqueeze [CLS] Bottleneck to Learn Rich Representations
Qing Su
Shihao Ji
19
0
0
24 Jul 2024
u-$\mu$P: The Unit-Scaled Maximal Update Parametrization
Charlie Blake
C. Eichenberg
Josef Dean
Lukas Balles
Luke Y. Prince
Björn Deiseroth
Andres Felipe Cruz Salinas
Carlo Luschi
Samuel Weinbach
Douglas Orr
46
9
0
24 Jul 2024
Pretrained Visual Representations in Reinforcement Learning
Emlyn Williams
Athanasios Polydoros
SSL
15
1
0
24 Jul 2024
SINDER: Repairing the Singular Defects of DINOv2
Haoqian Wang
Tong Zhang
Mathieu Salzmann
16
0
0
23 Jul 2024