2202.06709
How Do Vision Transformers Work?
International Conference on Learning Representations (ICLR), 2022
14 February 2022
Namuk Park
Songkuk Kim
ViT
Github (815★)
Papers citing "How Do Vision Transformers Work?"
50 / 258 papers shown
AttentionViz: A Global View of Transformer Attention
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023
Catherine Yeh
Yida Chen
Aoyu Wu
Cynthia Chen
Fernanda Viégas
Martin Wattenberg
ViT
324
89
0
04 May 2023
Learngene: Inheriting Condensed Knowledge from the Ancestry Model to Descendant Models
Qiufeng Wang
Xu Yang
Shuxia Lin
Jing Wang
Xin Geng
216
18
0
03 May 2023
What Do Self-Supervised Vision Transformers Learn?
International Conference on Learning Representations (ICLR), 2023
Namuk Park
Wonjae Kim
Byeongho Heo
Taekyung Kim
Sangdoo Yun
SSL
300
103
1
01 May 2023
Depth-Relative Self Attention for Monocular Depth Estimation
International Joint Conference on Artificial Intelligence (IJCAI), 2023
Kyuhong Shim
Jiyoung Kim
Gusang Lee
B. Shim
MDE
178
8
0
25 Apr 2023
Benchmarking Low-Shot Robustness to Natural Distribution Shifts
IEEE International Conference on Computer Vision (ICCV), 2023
Aaditya K. Singh
Kartik Sarangmath
Prithvijit Chattopadhyay
Judy Hoffman
OOD
311
3
0
21 Apr 2023
GlobalMind: Global Multi-head Interactive Self-attention Network for Hyperspectral Change Detection
ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2023
Meiqi Hu
Chen Wu
Guang Dai
326
46
0
18 Apr 2023
A Unified HDR Imaging Method with Pixel and Patch Level
Computer Vision and Pattern Recognition (CVPR), 2023
Qingsen Yan
Weiye Chen
Song Zhang
Yu Zhu
Jinqiu Sun
Yanning Zhang
131
33
0
14 Apr 2023
Dynamic Mobile-Former: Strengthening Dynamic Convolution with Attention and Residual Connection in Kernel Space
Seokju Yun
Youngmin Ro
ViT
154
2
0
13 Apr 2023
Simulated Annealing in Early Layers Leads to Better Generalization
Computer Vision and Pattern Recognition (CVPR), 2023
Amir M. Sarfi
Zahra Karimpour
Muawiz Chaudhary
N. Khalid
Mirco Ravanelli
Sudhir Mudur
Eugene Belilovsky
AI4CE
CLL
195
10
0
10 Apr 2023
Rethinking Evaluation Protocols of Visual Representations Learned via Self-supervised Learning
Jaehoon Lee
Doyoung Yoon
Byeongmoon Ji
Kyungyul Kim
Sangheum Hwang
SSL
221
4
0
07 Apr 2023
APPT: Asymmetric Parallel Point Transformer for 3D Point Cloud Understanding
Hengjia Li
Tu Zheng
Zhihao Chi
Zheng Yang
Wenxiao Wang
Boxi Wu
Binbin Lin
Deng Cai
3DPC
222
1
0
31 Mar 2023
ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization
Computer Vision and Pattern Recognition (CVPR), 2023
Jintao Guo
Na Wang
Lei Qi
Yinghuan Shi
291
61
0
21 Mar 2023
DS-TDNN: Dual-stream Time-delay Neural Network with Global-aware Filter for Speaker Verification
IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023
Yangfu Li
Jiapan Gan
Xiaodan Lin
275
9
0
20 Mar 2023
SRFormer: Permuted Self-Attention for Single Image Super-Resolution
Yupeng Zhou
Zerui Li
Chunle Guo
S. Bai
Ming-Ming Cheng
Qibin Hou
124
37
0
17 Mar 2023
ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices
IEEE International Conference on Computer Vision (ICCV), 2023
Chen Tang
Li Zhang
Huiqiang Jiang
Jiahang Xu
Ting Cao
Quanlu Zhang
Yuqing Yang
Zhi Wang
Mao Yang
153
14
0
17 Mar 2023
Rethinking Optical Flow from Geometric Matching Consistent Perspective
Computer Vision and Pattern Recognition (CVPR), 2023
Qiaole Dong
Chenjie Cao
Yanwei Fu
270
55
0
15 Mar 2023
Masked Image Modeling with Local Multi-Scale Reconstruction
Computer Vision and Pattern Recognition (CVPR), 2023
Haoqing Wang
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhiwei Deng
Kai Han
199
68
0
09 Mar 2023
Point Cloud Classification Using Content-based Transformer via Clustering in Feature Space
IEEE/CAA Journal of Automatica Sinica (IEEE/CAA JAS), 2023
Yahui Liu
Bin Wang
Yisheng Lv
Lingxi Li
Feiyue Wang
ViT
3DPC
253
74
0
08 Mar 2023
FFT-based Dynamic Token Mixer for Vision
AAAI Conference on Artificial Intelligence (AAAI), 2023
Yuki Tatsunami
Masato Taki
303
53
0
07 Mar 2023
Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention
Paria Mehrani
John K. Tsotsos
228
39
0
02 Mar 2023
Understanding plasticity in neural networks
International Conference on Machine Learning (ICML), 2023
Clare Lyle
Zeyu Zheng
Evgenii Nikishin
Bernardo Avila-Pires
Razvan Pascanu
Will Dabney
AI4CE
507
136
0
02 Mar 2023
Token Contrast for Weakly-Supervised Semantic Segmentation
Computer Vision and Pattern Recognition (CVPR), 2023
Lixiang Ru
Heliang Zheng
Yibing Zhan
Bo Du
ViT
333
140
0
02 Mar 2023
Swin Deformable Attention Hybrid U-Net for Medical Image Segmentation
Symposium on Medical Information Processing and Analysis (MIPA), 2023
Lichao Wang
Jiahao Huang
Xiaodan Xing
Guang Yang
117
4
0
28 Feb 2023
Teacher Intervention: Improving Convergence of Quantization Aware Training for Ultra-Low Precision Transformers
Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023
Minsoo Kim
Kyuhong Shim
Seongmin Park
Wonyong Sung
Jungwook Choi
MQ
141
2
0
23 Feb 2023
MedViT: A Robust Vision Transformer for Generalized Medical Image Classification
Omid Nejati Manzari
Hamid Ahmadabadi
Hossein Kashiani
S. B. Shokouhi
Ahmad Ayatollahi
ViT
MedIm
268
317
0
19 Feb 2023
Efficiency 360: Efficient Vision Transformers
Badri N. Patro
Vijay Srinivas Agneeswaran
405
7
0
16 Feb 2023
TFormer: A Transmission-Friendly ViT Model for IoT Devices
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023
Zhichao Lu
Chuntao Ding
Felix Juefei Xu
Vishnu Boddeti
Shangguang Wang
Yun Yang
183
21
0
15 Feb 2023
Self-supervised pseudo-colorizing of masked cells
PLoS ONE (PLoS ONE), 2023
Royden Wagner
Carlos Fernandez Lopez
Christoph Stiller
139
1
0
12 Feb 2023
Revisiting Image Deblurring with an Efficient ConvNet
Lingyan Ruan
Mojtaba Bemana
Hans-Peter Seidel
K. Myszkowski
Bin Chen
160
27
0
04 Feb 2023
Longformer: Longitudinal Transformer for Alzheimer's Disease Classification with Structural MRIs
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Qiu-hui Chen
Yi Hong
MedIm
287
26
0
02 Feb 2023
Enhancing Face Recognition with Latent Space Data Augmentation and Facial Posture Reconstruction
Expert Systems with Applications (ESWA), 2023
Soroush Hashemifar
Abdolreza Marefat
Javad Hassannataj Joloudari
H. Hassanpour
CVBM
325
14
0
27 Jan 2023
A Simple Adaptive Unfolding Network for Hyperspectral Image Reconstruction
Junyu Wang
Shijie Wang
Wenyu Liu
Zengqiang Zheng
Xinggang Wang
192
3
0
24 Jan 2023
Koopman neural operator as a mesh-free solver of non-linear partial differential equations
Journal of Computational Physics (JCP), 2023
Wei Xiong
Xiaomeng Huang
Ziyang Zhang
Ruixuan Deng
Pei Sun
Yang Tian
AI4CE
339
52
0
24 Jan 2023
Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing
Computer Vision and Pattern Recognition (CVPR), 2023
Shruthi Bannur
Stephanie L. Hyland
Qianchu Liu
Fernando Pérez-García
Maximilian Ilse
...
Maria T. A. Wetscherek
M. Lungren
A. Nori
Javier Alvarez-Valle
Ozan Oktay
313
207
0
11 Jan 2023
KoopmanLab: machine learning for solving complex physics equations
APL Machine Learning (AML), 2023
Wei Xiong
Muyuan Ma
Xiaomeng Huang
Ziyang Zhang
Pei Sun
Yang Tian
AI4CE
355
19
0
03 Jan 2023
Representation Separation for Semantic Segmentation with Vision Transformers
Yuanduo Hong
Huihui Pan
Weichao Sun
Xinghu Yu
Huijun Gao
ViT
179
5
0
28 Dec 2022
Investigation of Network Architecture for Multimodal Head-and-Neck Tumor Segmentation
Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 2022
Ye Li
Junyu Chen
Se-In Jang
Kuang Gong
Shijie Zhao
ViT
MedIm
171
1
0
21 Dec 2022
What do Vision Transformers Learn? A Visual Exploration
Amin Ghiasi
Hamid Kazemi
Eitan Borgnia
Steven Reich
Manli Shu
Micah Goldblum
A. Wilson
Tom Goldstein
ViT
259
79
0
13 Dec 2022
Towards Practical Plug-and-Play Diffusion Models
Computer Vision and Pattern Recognition (CVPR), 2022
Hyojun Go
Yunsung Lee
Jin-Young Kim
Seunghyun Lee
Myeongho Jeong
Hyun Seung Lee
Seungtaek Choi
DiffM
311
21
0
12 Dec 2022
Non-equispaced Fourier Neural Solvers for PDEs
Haitao Lin
Lirong Wu
Yongjie Xu
Yufei Huang
Siyuan Li
Guojiang Zhao
Z. Stan
234
8
0
09 Dec 2022
Group Generalized Mean Pooling for Vision Transformer
ByungSoo Ko
Han-Gyu Kim
Byeongho Heo
Sangdoo Yun
Sanghyuk Chun
Geonmo Gu
Wonjae Kim
ViT
295
3
0
08 Dec 2022
Teaching Matters: Investigating the Role of Supervision in Vision Transformers
Computer Vision and Pattern Recognition (CVPR), 2022
Matthew Walmer
Saksham Suri
Kamal Gupta
Abhinav Shrivastava
377
39
0
07 Dec 2022
Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Kyuyong Shin
Hanock Kwak
Wonjae Kim
Jisu Jeong
Seungjae Jung
KyungHyun Kim
Jung-Woo Ha
Sang-Woo Lee
399
6
0
07 Dec 2022
Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition
AAAI Conference on Artificial Intelligence (AAAI), 2022
WonJun Moon
Hyun Seok Seong
Jae-Pil Heo
VLM
178
6
0
24 Nov 2022
Beyond the Field-of-View: Enhancing Scene Visibility and Perception with Clip-Recurrent Transformer
IEEE Transactions on Intelligent Vehicles (IEEE Trans. Intell. Veh.), 2022
Haowen Shi
Zhijie Xu
Kailun Yang
Xiaoyue Yin
Ze Wang
Kaiwei Wang
ViT
263
5
0
21 Nov 2022
Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022
Minsoo Kim
Sihwa Lee
S. Hong
Duhyeuk Chang
Jungwook Choi
MQ
135
14
0
20 Nov 2022
Vision Transformers in Medical Imaging: A Review
Emerald U. Henry
Onyeka Emebob
C. Omonhinmin
ViT
MedIm
257
56
0
18 Nov 2022
MogaNet: Multi-order Gated Aggregation Network
International Conference on Learning Representations (ICLR), 2022
Siyuan Li
Zedong Wang
Zicheng Liu
Cheng Tan
Haitao Lin
Di Wu
Zhiyuan Chen
Jiangbin Zheng
Stan Z. Li
285
124
0
07 Nov 2022
ViT-LSLA: Vision Transformer with Light Self-Limited-Attention
Zhenzhe Hechen
Wei Huang
Yixin Zhao
ViT
113
9
0
31 Oct 2022
Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022
Yan Zhang
Xiyuan Gao
Qingyan Duan
Jiaxu Leng
Xiao Pu
Xinbo Gao
ViT
142
1
0
28 Oct 2022