Vision Transformers Need Registers

28 September 2023
Timothée Darcet
Maxime Oquab
Julien Mairal
Piotr Bojanowski
    ViT

Papers citing "Vision Transformers Need Registers"

50 / 239 papers shown
Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Z. Qiu
Z. Wang
Bo Zheng
Zeyu Huang
Kaiyue Wen
...
Fei Huang
Suozhi Huang
Dayiheng Liu
Jingren Zhou
Junyang Lin
MoE
11
0
0
10 May 2025
Register and CLS tokens yield a decoupling of local and global features in large ViTs
Alexander Lappe
M. Giese
12
0
0
09 May 2025
DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception
Junjie Wang
Bin Chen
Yulin Li
Bin Kang
Y. Chen
Zhuotao Tian
VLM
31
0
0
07 May 2025
CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment
Edson Araujo
Andrew Rouditchenko
Yuan Gong
Saurabhchand Bhati
Samuel Thomas
Brian Kingsbury
Leonid Karlinsky
Rogerio Feris
James Glass
27
0
0
02 May 2025
Transferable Adversarial Attacks on Black-Box Vision-Language Models
Kai Hu
Weichen Yu
L. Zhang
Alexander Robey
Andy Zou
Chengming Xu
Haoqi Hu
Matt Fredrikson
AAML
VLM
47
0
0
02 May 2025
Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
Muyi Bao
Shuchang Lyu
Zhaoyang Xu
Huiyu Zhou
Jinchang Ren
Shiming Xiang
X. Li
Guangliang Cheng
Mamba
72
0
0
01 May 2025
Learning Multi-view Multi-class Anomaly Detection
Qianzi Yu
Yang Cao
Yu Kang
43
0
0
30 Apr 2025
An Empirical Study on Prompt Compression for Large Language Models
Z. Zhang
Jinyi Li
Yihuai Lan
X. Wang
Hao Wang
MQ
37
0
0
24 Apr 2025
When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars
Rei Higuchi
Ryotaro Kawata
Naoki Nishikawa
Kazusato Oko
Shoichiro Yamaguchi
Sosuke Kobayashi
Seiya Tokui
K. Hayashi
Daisuke Okanohara
Taiji Suzuki
AI4CE
30
0
0
24 Apr 2025
PhysioSync: Temporal and Cross-Modal Contrastive Learning Inspired by Physiological Synchronization for EEG-Based Emotion Recognition
Kai Cui
J. Li
Y. Liu
Xuesong Zhang
Zhenzhen Hu
M. Wang
32
0
0
24 Apr 2025
Boosting Generative Image Modeling via Joint Image-Feature Synthesis
Theodoros Kouzelis
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
DiffM
24
0
0
22 Apr 2025
IXGS-Intraoperative 3D Reconstruction from Sparse, Arbitrarily Posed Real X-rays
Sascha Jecklin
Aidana Massalimova
Ruyi Zha
Lilian Calvet
C. Laux
Mazda Farshad
Philipp Fürnstahl
19
0
0
20 Apr 2025
Perception Encoder: The best visual embeddings are not at the output of the network
Daniel Bolya
Po-Yao (Bernie) Huang
Peize Sun
Jang Hyun Cho
Andrea Madotto
...
Shiyu Dong
Nikhila Ravi
Daniel Li
Piotr Dollár
Christoph Feichtenhofer
ObjD
VOS
98
0
0
17 Apr 2025
Search is All You Need for Few-shot Anomaly Detection
Qishan Wang
Jia Guo
Shuyong Gao
H. Wang
Li Xiong
J. Hu
Hanqi Guo
Wenqiang Zhang
53
0
0
16 Apr 2025
Vision-Language Model for Object Detection and Segmentation: A Review and Evaluation
Yongchao Feng
Yajie Liu
Shuai Yang
Wenrui Cai
J. Zhang
...
Jiahui Lv
Z. Liu
Tengyuan Shi
Qingjie Liu
Y. Wang
MLLM
VLM
47
1
0
13 Apr 2025
Hypergraph Vision Transformers: Images are More than Nodes, More than Edges
Joshua Fixelle
ViT
22
0
0
11 Apr 2025
MARS: a Multimodal Alignment and Ranking System for Few-Shot Segmentation
Nico Catalano
Stefano Samele
Paolo Pertino
Matteo Matteucci
3DPC
40
0
0
10 Apr 2025
Latent Diffusion U-Net Representations Contain Positional Embeddings and Anomalies
Jonas Loos
Lorenz Linhardt
21
0
0
09 Apr 2025
Detect All-Type Deepfake Audio: Wavelet Prompt Tuning for Enhanced Auditory Perception
Yuankun Xie
Ruibo Fu
Z. Wang
Xiaopeng Wang
Songjun Cao
Long Ma
Haonan Cheng
Long Ye
18
0
0
09 Apr 2025
Earth-Adapter: Bridge the Geospatial Domain Gaps with Mixture of Frequency Adaptation
Xiaoxing Hu
Ziyang Gong
Y. Wang
Yuru Jia
Gen Luo
Xue Yang
26
0
0
08 Apr 2025
Accurate Ab-initio Neural-network Solutions to Large-Scale Electronic Structure Problems
Michael Scherbela
Nicholas Gao
Philipp Grohs
Stephan Günnemann
21
0
0
08 Apr 2025
Conditioning Diffusions Using Malliavin Calculus
Jakiw Pidstrigach
Elizabeth Baker
Carles Domingo-Enrich
George Deligiannidis
Nikolas Nüsken
DiffM
30
0
0
04 Apr 2025
Refining CLIP's Spatial Awareness: A Visual-Centric Perspective
Congpei Qiu
Yanhao Wu
Wei Ke
Xiuxiu Bai
Tong Zhang
VLM
44
0
0
03 Apr 2025
Unified World Models: Coupling Video and Action Diffusion for Pretraining on Large Robotic Datasets
Chuning Zhu
Raymond Yu
S. Feng
Benjamin Burchfiel
Paarth Shah
Abhishek Gupta
VGen
50
0
0
03 Apr 2025
Q-Adapt: Adapting LMM for Visual Quality Assessment with Progressive Instruction Tuning
Yiting Lu
X. Li
H. Wu
Bingchen Li
Weisi Lin
Zhibo Chen
37
1
0
02 Apr 2025
Prompt-Guided Attention Head Selection for Focus-Oriented Image Retrieval
Yuji Nozawa
Yu Lin
Kazumoto Nakamura
Youyang Ng
38
0
0
02 Apr 2025
Leveraging Diffusion Model and Image Foundation Model for Improved Correspondence Matching in Coronary Angiography
Lin Zhao
Xin Yu
Yikang Liu
Xiao Chen
Eric Z. Chen
Terrence Chen
Shanhui Sun
DiffM
MedIm
37
0
0
31 Mar 2025
LATex: Leveraging Attribute-based Text Knowledge for Aerial-Ground Person Re-Identification
Xiang Hu
Yuhao Wang
Pingping Zhang
Huchuan Lu
VLM
37
0
0
31 Mar 2025
ViT-Linearizer: Distilling Quadratic Knowledge into Linear-Time Vision Models
Guoyizhe Wei
Rama Chellappa
31
0
0
30 Mar 2025
Endo-TTAP: Robust Endoscopic Tissue Tracking via Multi-Facet Guided Attention and Hybrid Flow-point Supervision
Rulin Zhou
Wenlong He
An Wang
Qiqi Yao
Haijun Hu
Jiankun Wang
Xi Zhang
Hongliang Ren
29
0
0
28 Mar 2025
Beyond Intermediate States: Explaining Visual Redundancy through Language
Dingchen Yang
Bowen Cao
Anran Zhang
Weibo Gu
Winston Hu
Guang Chen
VLM
76
0
0
26 Mar 2025
ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning
Chau Pham
Juan C. Caicedo
Bryan A. Plummer
37
0
0
25 Mar 2025
Learning 3D Object Spatial Relationships from Pre-trained 2D Diffusion Models
Sangwon Beak
Hyeonwoo Kim
Hanbyul Joo
36
0
0
25 Mar 2025
LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation
Vladan Stojnić
Yannis Kalantidis
Jiří Matas
Giorgos Tolias
VLM
41
0
0
25 Mar 2025
Your ViT is Secretly an Image Segmentation Model
Tommie Kerssies
Niccolò Cavagnero
Alexander Hermans
Narges Norouzi
Giuseppe Averta
Bastian Leibe
Gijs Dubbelman
Daan de Geus
ViT
VLM
51
1
0
24 Mar 2025
Rethinking Glaucoma Calibration: Voting-Based Binocular and Metadata Integration
Taejin Jeong
Joohyeok Kim
Jaehoon Joo
Yeonwoo Jung
Hyeonmin Kim
Seong Jae Hwang
48
0
0
24 Mar 2025
Beyond Accuracy: What Matters in Designing Well-Behaved Models?
Robin Hesse
Doğukan Bağcı
Bernt Schiele
Simone Schaub-Meyer
Stefan Roth
VLM
49
0
0
21 Mar 2025
Exploring Few-Shot Object Detection on Blood Smear Images: A Case Study of Leukocytes and Schistocytes
Davide Antonio Mura
Michela Pinna
Lorenzo Putzu
A. Loddo
Alessandra Perniciano
Olga Mulas
Cecilia Di Ruberto
37
0
0
21 Mar 2025
Learning 3D Scene Analogies with Neural Contextual Scene Maps
Junho Kim
Gwangtak Bae
E. Lee
Young Min Kim
3DPC
3DV
55
0
0
20 Mar 2025
RFMI: Estimating Mutual Information on Rectified Flow for Text-to-Image Alignment
Chao Wang
Giulio Franzese
A. Finamore
Pietro Michiardi
58
0
0
18 Mar 2025
An interpretable approach to automating the assessment of biofouling in video footage
Evelyn J. Mannix
Bartholomew A. Woodham
48
0
0
17 Mar 2025
COIN: Confidence Score-Guided Distillation for Annotation-Free Cell Segmentation
Sanghyun Jo
Seo Jin Lee
Seungwoo Lee
Seohyung Hong
Hyungseok Seo
Kyungsu Kim
41
0
0
14 Mar 2025
VGGT: Visual Geometry Grounded Transformer
Jianyuan Wang
Minghao Chen
Nikita Karaev
Andrea Vedaldi
Christian Rupprecht
David Novotny
ViT
46
6
0
14 Mar 2025
Beyond Atoms: Enhancing Molecular Pretrained Representations with 3D Space Modeling
Shuqi Lu
Xiaohong Ji
Bohang Zhang
Lin Yao
Siyuan Liu
Zhifeng Gao
Linfeng Zhang
Guolin Ke
AI4CE
33
1
0
13 Mar 2025
Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding
Shunqi Mao
Chaoyi Zhang
Weidong Cai
MLLM
48
0
0
13 Mar 2025
Studying Classifier(-Free) Guidance From a Classifier-Centric Perspective
Xiaoming Zhao
Alexander Schwing
FaML
58
0
0
13 Mar 2025
The Power of One: A Single Example is All it Takes for Segmentation in VLMs
Mir Rayat Imtiaz Hossain
Mennatullah Siam
Leonid Sigal
James J. Little
MLLM
VLM
61
0
0
13 Mar 2025
Robustness Tokens: Towards Adversarial Robustness of Transformers
Brian Pulfer
Yury Belousov
S. Voloshynovskiy
AAML
32
0
0
13 Mar 2025
Evaluating Visual Explanations of Attention Maps for Transformer-based Medical Imaging
Minjae Chung
Jong Bum Won
Ganghyun Kim
Yujin Kim
Utku Ozbulak
MedIm
47
0
0
12 Mar 2025
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models
Kwan Yun
Seokhyeon Hong
Chaelin Kim
Junyong Noh
DiffM
VGen
36
0
0
11 Mar 2025