How Do Vision Transformers Work?

International Conference on Learning Representations (ICLR), 2022
14 February 2022
Namuk Park
Songkuk Kim
    ViT
ArXiv (abs), PDF, HTML, Github (815★)

Papers citing "How Do Vision Transformers Work?"

50 / 258 papers shown
Rethinking the Use of Vision Transformers for AI-Generated Image Detection
NaHyeon Park
Kunhee Kim
Junsuk Choe
Hyunjung Shim
DiffM
149
0
0
04 Dec 2025
LAHNet: Local Attentive Hashing Network for Point Cloud Registration
Wentao Qu
Xiaoshui Huang
Liang Xiao
3DPC
127
0
0
30 Nov 2025
Frequency-Aware Token Reduction for Efficient Vision Transformer
Dong-Jae Lee
Jiwan Hur
Jaehyun Choi
Jaemyung Yu
Junmo Kim
188
0
0
26 Nov 2025
CountXplain: Interpretable Cell Counting with Prototype-Based Density Map Estimation
Abdurahman Ali Mohammed
Wallapak Tavanapong
Catherine Fonder
Donald S. Sakaguchi
74
0
0
24 Nov 2025
On the Role of Hidden States of Modern Hopfield Network in Transformer
Tsubasa Masumura
Masato Taki
120
0
0
24 Nov 2025
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation
Zehong Ma
Longhui Wei
Shuai Wang
Shiliang Zhang
Qi Tian
DiffM
137
2
0
24 Nov 2025
DetailSemNet: Elevating Signature Verification through Detail-Semantic Integration
European Conference on Computer Vision (ECCV), 2025
Meng-Cheng Shih
Tsai-Ling Huang
Yu-Heng Shih
Hong-Han Shuai
Hsuan-Tung Liu
Yi-Ren Yeh
Ching-Chun Huang
151
2
0
20 Nov 2025
Application of Graph Based Vision Transformers Architectures for Accurate Temperature Prediction in Fiber Specklegram Sensors
Abhishek Sebastian
141
0
0
15 Nov 2025
SYNAPSE: Synergizing an Adapter and Finetuning for High-Fidelity EEG Synthesis from a CLIP-Aligned Encoder
Jeyoung Lee
Hochul Kang
DiffM
72
0
0
11 Nov 2025
UHKD: A Unified Framework for Heterogeneous Knowledge Distillation via Frequency-Domain Representations
Fengming Yu
Haiwei Pan
Kejia Zhang
Jian Guan
Haiying Jiang
154
0
0
28 Oct 2025
Exploring and Leveraging Class Vectors for Classifier Editing
Jaeik Kim
Jaeyoung Do
VLM
193
0
0
13 Oct 2025
Robust RGB-T Tracking via Learnable Visual Fourier Prompt Fine-tuning and Modality Fusion Prompt Generation
Hongtao Yang
Bineng Zhong
Qihua Liang
Zhiruo Zhu
Yaozong Zheng
Ning Li
157
0
0
24 Sep 2025
A Modern Look at Simplicity Bias in Image Classification Tasks
Xiaoguang Chang
Teng Wang
Changyin Sun
AAML
138
0
0
13 Sep 2025
Fine-grained Multi-class Nuclei Segmentation with Molecular-empowered All-in-SAM Model
Journal of Medical Imaging (JMI), 2025
Xueyuan Li
Can Cui
Ruining Deng
Yucheng Tang
Quan Liu
Tianyuan Yao
Shunxing Bao
Naweed Chowdhury
Haichun Yang
Daniel Moyer
VLM
116
2
0
21 Aug 2025
MoCHA-former: Moiré-Conditioned Hybrid Adaptive Transformer for Video Demoiréing
Jeahun Sung
Changhyun Roh
Chanho Eom
Jihyong Oh
237
0
0
20 Aug 2025
Omni Survey for Multimodality Analysis in Visual Object Tracking
Zhangyong Tang
Tianyang Xu
Xuefeng Zhu
Hui Li
Shaochuan Zhao
Tao Zhou
Chunyang Cheng
Xiaojun Wu
Josef Kittler
190
2
0
18 Aug 2025
Cross-Architecture Distillation Made Simple with Redundancy Suppression
Weijia Zhang
Yuehao Liu
Wu Ran
Chao Ma
185
2
0
29 Jul 2025
Frequency-Dynamic Attention Modulation for Dense Prediction
Linwei Chen
Lin Gu
Ying Fu
552
3
0
16 Jul 2025
FastDINOv2: Frequency Based Curriculum Learning Improves Robustness and Training Speed
Jiaqi Zhang
Juntuo Wang
Zhixin Sun
John Zou
Randall Balestriero
140
0
0
04 Jul 2025
Frequency-Aligned Knowledge Distillation for Lightweight Spatiotemporal Forecasting
Yuqi Li
Chuanguang Yang
Hansheng Zeng
Zeyu Dong
Zhulin An
Yongjun Xu
Yingli Tian
Hao Wu
AI4TS
267
25
0
27 Jun 2025
CL-LoRA: Continual Low-Rank Adaptation for Rehearsal-Free Class-Incremental Learning
Computer Vision and Pattern Recognition (CVPR), 2025
Jiangpeng He
Zhihao Duan
Fengqing M Zhu
CLL
188
6
0
30 May 2025
Proximal Algorithm Unrolling: Flexible and Efficient Reconstruction Networks for Single-Pixel Imaging
Computer Vision and Pattern Recognition (CVPR), 2025
Ping Wang
Lishun Wang
Gang Qu
Xiaodong Wang
Yulun Zhang
Xin Yuan
140
4
0
29 May 2025
Locality-Aware Zero-Shot Human-Object Interaction Detection
Computer Vision and Pattern Recognition (CVPR), 2025
Sanghyun Kim
Deunsol Jung
Minsu Cho
VLM
359
3
0
26 May 2025
Understanding Differential Transformer Unchains Pretrained Self-Attentions
Chaerin Kong
Jiho Jang
Nojun Kwak
459
0
0
22 May 2025
Learning Spatio-Temporal Dynamics for Trajectory Recovery via Time-Aware Transformer
Tian Sun
Yuqi Chen
Baihua Zheng
Weiwei Sun
162
3
0
20 May 2025
Towards Quantifying the Hessian Structure of Neural Networks
Zhaorui Dong
Yushun Zhang
Jianfeng Yao
303
3
0
05 May 2025
CVVNet: A Cross-Vertical-View Network for Gait Recognition
Xuelong Li
Wei Song
Yingda Huang
Wei Meng
Le Chang
Hongyang Li
CVBM
282
1
0
03 May 2025
Exploring Synergistic Ensemble Learning: Uniting CNNs, MLP-Mixers, and Vision Transformers to Enhance Image Classification
Mk Bashar
Ocean Monjur
Samia Islam
Mohammad Galib Shams
Niamul Quader
UQCV
254
1
0
12 Apr 2025
Spectral-Adaptive Modulation Networks for Visual Perception
Guhnoo Yun
J. Yoo
Kijung Kim
Jeongho Lee
Paul Hongsuck Seo
Dong Hwan Kim
426
0
0
31 Mar 2025
Filtering with Time-frequency Analysis: An Adaptive and Lightweight Model for Sequential Recommender Systems Based on Discrete Wavelet Transform
International Conference on Intelligent Computing (ICIC), 2025
Sheng Lu
Mingxi Ge
Jiuyi Zhang
Wanli Zhu
Guanjin Li
Fangming Gu
AI4TS
543
1
0
30 Mar 2025
BlockDance: Reuse Structurally Similar Spatio-Temporal Features to Accelerate Diffusion Transformers
Computer Vision and Pattern Recognition (CVPR), 2025
Hui Zhang
Tingwei Gao
Jie Shao
Zuxuan Wu
351
11
0
20 Mar 2025
Learning to Efficiently Adapt Foundation Models for Self-Supervised Endoscopic 3D Scene Reconstruction from Any Cameras
Beilei Cui
Long Bai
Mobarakol Islam
An-Chi Wang
Tianhao Shen
...
Feng Li
Daming Gao
Zhongliang Jiang
Nassir Navab
Hongliang Ren
MedIm
240
0
0
20 Mar 2025
FEB-Cache: Frequency-Guided Exposure Bias Reduction for Enhancing Diffusion Transformer Caching
Zhen Zou
Hu Yu
310
0
0
10 Mar 2025
Spatial-Spectral Diffusion Contrastive Representation Network for Hyperspectral Image Classification
IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS), 2025
Yimin Zhu
Linlin Xu
DiffM
379
2
0
27 Feb 2025
Learning Structure-Supporting Dependencies via Keypoint Interactive Transformer for General Mammal Pose Estimation
International Journal of Computer Vision (IJCV), 2025
Tianyang Xu
Jiyong Rao
Xiaoning Song
Zhenhua Feng
Rui Wang
ViT
412
3
0
25 Feb 2025
Understanding Why Adam Outperforms SGD: Gradient Heterogeneity in Transformers
Akiyoshi Tomihari
Issei Sato
ODL
706
5
0
31 Jan 2025
Keypoint Aware Masked Image Modelling
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024
Madhava Krishna
Convin.AI
454
1
0
03 Jan 2025
Navigating the Maze of Explainable AI: A Systematic Approach to Evaluating Methods and Metrics
Neural Information Processing Systems (NeurIPS), 2024
Lukas Klein
Carsten T. Lüth
U. Schlegel
Till J. Bungert
Mennatallah El-Assady
Paul F. Jäger
XAI
ELM
648
16
0
03 Jan 2025
Prompt Categories Cluster for Weakly Supervised Semantic Segmentation
Wangyu Wu
Xianglin Qiu
Siqi Song
Xiaowei Huang
Fei Ma
Jimin Xiao
VLM
584
26
0
18 Dec 2024
Adaptive High-Pass Kernel Prediction for Efficient Video Deblurring
Bo Ji
Angela Yao
371
0
0
02 Dec 2024
Enhancing Parameter-Efficient Fine-Tuning of Vision Transformers through Frequency-Based Adaptation
S. Ly
Hien Nguyen
330
4
0
28 Nov 2024
D-Cube: Exploiting Hyper-Features of Diffusion Model for Robust Medical Classification
Industrial Conference on Data Mining (IDM), 2024
Minhee Jang
Juheon Son
Thanaporn Viriyasaranon
Junho Kim
Jang-Hwan Choi
MedIm
346
0
0
17 Nov 2024
Freqformer: Frequency-Domain Transformer for 3-D Reconstruction and Quantification of Human Retinal Vasculature
Lingyun Wang
Bingjie Wang
Jay Chhablani
J. Sahel
Shaohua Pi
MedIm
215
2
0
17 Nov 2024
Where Do Large Learning Rates Lead Us?
Neural Information Processing Systems (NeurIPS), 2024
Ildus Sadrtdinov
M. Kodryan
Eduard Pokonechny
E. Lobacheva
Dmitry Vetrov
AI4CE
331
5
0
29 Oct 2024
Depth Attention for Robust RGB Tracking
Asian Conference on Computer Vision (ACCV), 2024
Yu Liu
Arif Mahmood
Muhammad Haris Khan
VOS
MDE
313
1
0
27 Oct 2024
In Search of the Successful Interpolation: On the Role of Sharpness in CLIP Generalization
Alireza Abdollahpoorrostam
239
0
0
21 Oct 2024
Fuse Before Transfer: Knowledge Fusion for Heterogeneous Distillation
Guopeng Li
Qiang Wang
K. Yan
Shouhong Ding
Yuan Gao
Gui-Song Xia
407
0
0
16 Oct 2024
CATCH: Channel-Aware multivariate Time Series Anomaly Detection via Frequency Patching
International Conference on Learning Representations (ICLR), 2024
Xingjian Wu
Xiangfei Qiu
Zhengyu Li
Yihang Wang
Jilin Hu
Chenjuan Guo
Hui Xiong
Bin Yang
AI4TS
576
57
0
16 Oct 2024
What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis
International Conference on Learning Representations (ICLR), 2024
Weronika Ormaniec
Felix Dangel
Sidak Pal Singh
543
9
0
14 Oct 2024
Neural Architecture Search of Hybrid Models for NPU-CIM Heterogeneous AR/VR Devices
Yiwei Zhao
Ziyun Li
Win-San Khwa
Xiaoyu Sun
Sai Qian Zhang
...
Jorge Gomez
Jae-sun Seo
Phillip B. Gibbons
B. D. Salvo
Chiao Liu
124
6
0
10 Oct 2024