ResearchTrend.AI
Vision Transformers Need Registers
arXiv: 2309.16588

28 September 2023
Timothée Darcet
Maxime Oquab
Julien Mairal
Piotr Bojanowski
    ViT

Papers citing "Vision Transformers Need Registers"

50 / 239 papers shown
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models
Kwan Yun
Seokhyeon Hong
Chaelin Kim
Junyong Noh
DiffM
VGen
38
0
0
11 Mar 2025
Iterative Prompt Relocation for Distribution-Adaptive Visual Prompt Tuning
Chikai Shang
Mengke Li
Yiqun Zhang
Zhen Chen
Jinlin Wu
Fangqing Gu
Yang Lu
Yiu-ming Cheung
VLM
66
0
0
10 Mar 2025
Exploring Interpretability for Visual Prompt Tuning with Hierarchical Concepts
Yubin Wang
Xinyang Jiang
De Cheng
Xiangqian Zhao
Zilong Wang
Dongsheng Li
Cairong Zhao
VLM
59
0
0
08 Mar 2025
See What You Are Told: Visual Attention Sink in Large Multimodal Models
Seil Kang
Jinyeong Kim
Junhyeok Kim
Seong Jae Hwang
VLM
98
5
0
05 Mar 2025
Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection
Wei Luo
Yunkang Cao
Haiming Yao
Xiaotian Zhang
Jianan Lou
Y. Cheng
Weiming Shen
Wenyong Yu
55
1
0
04 Mar 2025
Primus: Enforcing Attention Usage for 3D Medical Image Segmentation
Tassilo Wald
Saikat Roy
Fabian Isensee
Constantin Ulrich
Sebastian Ziegler
D. Trofimova
Raphael Stock
Michael Baumgartner
Gregor Köhler
Klaus H. Maier-Hein
ViT
MedIm
37
1
0
03 Mar 2025
Transformer Meets Twicing: Harnessing Unattended Residual Information
Laziz U. Abdullaev
Tan M. Nguyen
32
2
0
02 Mar 2025
Proteina: Scaling Flow-based Protein Structure Generative Models
Tomas Geffner
Kieran Didi
Zuobai Zhang
Danny Reidenbach
Zhonglin Cao
...
Mario Geiger
Christian Dallago
E. Küçükbenli
Arash Vahdat
Karsten Kreis
DiffM
AI4CE
33
4
0
02 Mar 2025
Tell me why: Visual foundation models as self-explainable classifiers
Hugues Turbé
Mina Bjelogrlic
G. Mengaldo
Christian Lovis
59
0
0
26 Feb 2025
MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs
Jiarui Zhang
Mahyar Khayatkhoei
P. Chhikara
Filip Ilievski
LRM
34
5
0
24 Feb 2025
Simpler Fast Vision Transformers with a Jumbo CLS Token
A. Fuller
Yousef Yassin
Daniel G. Kyrollos
Evan Shelhamer
James R. Green
64
0
0
24 Feb 2025
Few-shot Species Range Estimation
Christian Lange
Max Hamilton
Elijah Cole
Alexander Shepard
Samuel Heinrich
Angela Zhu
Subhransu Maji
Grant Van Horn
Oisin Mac Aodha
68
0
0
24 Feb 2025
UNION: Unsupervised 3D Object Detection using Object Appearance-based Pseudo-Classes
T. Lentsch
Holger Caesar
D. Gavrila
3DPC
65
7
0
20 Feb 2025
Object-Centric Image to Video Generation with Language Guidance
Angel Villar-Corrales
Gjergj Plepi
Sven Behnke
DiffM
VGen
OCL
63
0
0
17 Feb 2025
Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment
Harrish Thasarathan
Julian Forsyth
Thomas Fel
M. Kowal
Konstantinos G. Derpanis
81
7
0
06 Feb 2025
Disentangling CLIP Features for Enhanced Localized Understanding
Samyak Rawelekar
Yujun Cai
Yiwei Wang
Ming-Hsuan Yang
N. Ahuja
VLM
CoGe
65
0
0
05 Feb 2025
ZISVFM: Zero-Shot Object Instance Segmentation in Indoor Robotic Environments with Vision Foundation Models
Ying Zhang
Maoliang Yin
Wenfu Bi
Haibao Yan
Shaohan Bian
Cui-Hua Zhang
C. Hua
65
2
0
05 Feb 2025
Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation
Adil Kaan Akan
Yucel Yemez
DiffM
OCL
37
0
0
27 Jan 2025
Parallel Sequence Modeling via Generalized Spatial Propagation Network
Hongjun Wang
Wonmin Byeon
Jiarui Xu
Jinwei Gu
Ka Chun Cheung
Xiaolong Wang
Kai Han
Jan Kautz
Sifei Liu
44
0
0
21 Jan 2025
MoRe: Class Patch Attention Needs Regularization for Weakly Supervised Semantic Segmentation
Zhiwei Yang
Yucong Meng
Kexue Fu
Shuo Wang
Zhijian Song
82
1
0
20 Jan 2025
A Bias-Free Training Paradigm for More General AI-generated Image Detection
Fabrizio Guillaro
Giada Zingarini
Ben Usman
Avneesh Sud
D. Cozzolino
L. Verdoliva
DiffM
54
3
0
23 Dec 2024
DINOv2 Meets Text: A Unified Framework for Image- and Pixel-Level Vision-Language Alignment
Cijo Jose
Théo Moutakanni
Dahyun Kang
Federico Baldassarre
Timothée Darcet
...
Maxime Oquab
Oriane Siméoni
Huy V. Vo
Patrick Labatut
Piotr Bojanowski
CLIP
VLM
88
6
0
20 Dec 2024
Towards scientific discovery with dictionary learning: Extracting biological concepts from microscopy foundation models
Konstantin Donhauser
Kristina Ulicna
Gemma Elyse Moran
Aditya Ravuri
Kian Kenyon-Dean
Cian Eastwood
Jason Hartford
76
0
0
20 Dec 2024
DINO-Foresight: Looking into the Future with DINO
Efstathios Karypidis
Ioannis Kakogeorgiou
Spyros Gidaris
N. Komodakis
AI4CE
77
1
0
16 Dec 2024
AsymRnR: Video Diffusion Transformers Acceleration with Asymmetric Reduction and Restoration
Wenhao Sun
Rong-Cheng Tu
Jingyi Liao
Zhao Jin
Dacheng Tao
VGen
97
1
0
16 Dec 2024
Feat2GS: Probing Visual Foundation Models with Gaussian Splatting
Yue Chen
Xingyu Chen
Anpei Chen
Gerard Pons-Moll
Yuliang Xiu
3DGS
78
2
0
12 Dec 2024
Cross-View Completion Models are Zero-shot Correspondence Estimators
Honggyu An
J. Kim
Seonghoon Park
Jaewoo Jung
Jisang Han
Sunghwan Hong
Seungryong Kim
3DV
73
3
0
12 Dec 2024
Beyond Text-Visual Attention: Exploiting Visual Cues for Effective Token Pruning in VLMs
Qizhe Zhang
Aosong Cheng
Ming Lu
Zhiyong Zhuo
Minqi Wang
Jiajun Cao
Shaobo Guo
Qi She
Shanghang Zhang
VLM
78
11
0
02 Dec 2024
Explaining the Impact of Training on Vision Models via Activation Clustering
Ahcène Boubekki
Samuel G. Fadel
Sebastian Mair
84
0
0
29 Nov 2024
Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation
Luca Barsellotti
Lorenzo Bianchi
Nicola Messina
F. Carrara
Marcella Cornia
Lorenzo Baraldi
Fabrizio Falchi
Rita Cucchiara
VLM
62
2
0
28 Nov 2024
UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image
Xingyu Liu
Gu Wang
Ruida Zhang
Chenyangguang Zhang
F. Tombari
Xiangyang Ji
79
2
0
25 Nov 2024
ResCLIP: Residual Attention for Training-free Dense Vision-language Inference
Yuhang Yang
Jinhong Deng
Wen Li
Lixin Duan
VLM
66
0
0
24 Nov 2024
Medical Slice Transformer: Improved Diagnosis and Explainability on 3D Medical Images with DINOv2
Gustav Muller-Franzes
Firas Khader
R. Siepmann
T. Han
Jakob Nikolas Kather
S. Nebelung
Daniel Truhn
MedIm
59
0
0
24 Nov 2024
Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation
Sule Bai
Yong-Jin Liu
Yifei Han
Haoji Zhang
Yansong Tang
VLM
70
3
0
24 Nov 2024
Hymba: A Hybrid-head Architecture for Small Language Models
Xin Dong
Y. Fu
Shizhe Diao
Wonmin Byeon
Zijia Chen
...
Min-Hung Chen
Yoshi Suhara
Y. Lin
Jan Kautz
Pavlo Molchanov
Mamba
86
13
0
20 Nov 2024
Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting
Hongjun Wang
Jiyuan Chen
Lingyu Zhang
Renhe Jiang
Xuan Song
AI4TS
66
0
0
18 Nov 2024
CorrCLIP: Reconstructing Correlations in CLIP with Off-the-Shelf Foundation Models for Open-Vocabulary Semantic Segmentation
Dengke Zhang
Fagui Liu
Quan Tang
VLM
35
0
0
15 Nov 2024
Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation
Yuheng Shi
Minjing Dong
Chang Xu
VLM
19
1
0
14 Nov 2024
FlowTS: Time Series Generation via Rectified Flow
Yang Hu
X. Wang
Lirong Wu
H. M. Zhang
Stan Z. Li
Sheng Wang
Tianlong Chen
Jiheng Zhang
Ziyun Li
Tianlong Chen
AI4TS
16
0
0
12 Nov 2024
CapeLLM: Support-Free Category-Agnostic Pose Estimation with Multimodal Large Language Models
Junho Kim
Hyungjin Chung
Byung-Hoon Kim
VLM
26
0
0
11 Nov 2024
Moving Off-the-Grid: Scene-Grounded Video Representations
Sjoerd van Steenkiste
Daniel Zoran
Yi Yang
Yulia Rubanova
Rishabh Kabra
...
Thomas Keck
João Carreira
Alexey Dosovitskiy
Mehdi S. M. Sajjadi
Thomas Kipf
24
3
0
08 Nov 2024
MambaPEFT: Exploring Parameter-Efficient Fine-Tuning for Mamba
Masakazu Yoshimura
Teruaki Hayashi
Yota Maeda
Mamba
56
2
0
06 Nov 2024
On Improved Conditioning Mechanisms and Pre-training Strategies for Diffusion Models
Tariq Berrada Ifriqi
Pietro Astolfi
Melissa Hall
Reyhane Askari Hemmat
Yohann Benchetrit
...
Matthew Muckley
Karteek Alahari
Adriana Romero Soriano
Jakob Verbeek
M. Drozdzal
AI4CE
VLM
45
3
0
05 Nov 2024
Rethinking Decoders for Transformer-based Semantic Segmentation: A Compression Perspective
Qishuai Wen
Chun-Guang Li
ViT
27
0
0
05 Nov 2024
Sparsh: Self-supervised touch representations for vision-based tactile sensing
Carolina Higuera
Akash Sharma
Chaithanya Krishna Bodduluri
Taosha Fan
Patrick E. Lancaster
...
Michael Kaess
Byron Boots
Mike Lambeta
Tingfan Wu
Mustafa Mukadam
29
11
0
31 Oct 2024
Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models
Shenghao Fu
Junkai Yan
Q. Yang
Xihan Wei
Xiaohua Xie
Wei-Shi Zheng
VLM
18
3
0
25 Oct 2024
SRA: A Novel Method to Improve Feature Embedding in Self-supervised Learning for Histopathological Images
Hamid Manoochehri
Bodong Zhang
Beatrice S. Knudsen
Tolga Tasdizen
16
1
0
23 Oct 2024
From Attention to Activation: Unravelling the Enigmas of Large Language Models
Prannay Kaul
Chengcheng Ma
Ismail Elezi
Jiankang Deng
18
2
0
22 Oct 2024
Random Token Fusion for Multi-View Medical Diagnosis
Jingyu Guo
Christos Matsoukas
Fredrik Strand
Kevin Smith
MedIm
19
0
0
21 Oct 2024
SINGAPO: Single Image Controlled Generation of Articulated Parts in Objects
Jiayi Liu
Denys Iliash
Angel X. Chang
Manolis Savva
Ali Mahdavi-Amiri
46
7
0
21 Oct 2024