
How Do Vision Transformers Work?

International Conference on Learning Representations (ICLR), 2022

14 February 2022

Namuk Park

Songkuk Kim

ViT


Papers citing "How Do Vision Transformers Work?"

50 / 258 papers shown

AttentionViz: A Global View of Transformer Attention

IEEE Transactions on Visualization and Computer Graphics (TVCG), 2023

325

04 May 2023

Learngene: Inheriting Condensed Knowledge from the Ancestry Model to Descendant Models

Jing Wang

217

03 May 2023

What Do Self-Supervised Vision Transformers Learn?

International Conference on Learning Representations (ICLR), 2023

301

103

01 May 2023

Depth-Relative Self Attention for Monocular Depth Estimation

International Joint Conference on Artificial Intelligence (IJCAI), 2023

185

25 Apr 2023

Benchmarking Low-Shot Robustness to Natural Distribution Shifts

IEEE International Conference on Computer Vision (ICCV), 2023

Aaditya K. Singh

Kartik Sarangmath

Prithvijit Chattopadhyay

Judy Hoffman

OOD

311

21 Apr 2023

GlobalMind: Global Multi-head Interactive Self-attention Network for Hyperspectral Change Detection

ISPRS Journal of Photogrammetry and Remote Sensing (ISPRS J. Photogramm. Remote Sens.), 2023

Meiqi Hu

Chen Wu

Guang Dai

326

18 Apr 2023

A Unified HDR Imaging Method with Pixel and Patch Level

Computer Vision and Pattern Recognition (CVPR), 2023

Qingsen Yan

132

14 Apr 2023

Dynamic Mobile-Former: Strengthening Dynamic Convolution with Attention and Residual Connection in Kernel Space

Seokju Yun

Youngmin Ro

ViT

158

13 Apr 2023

Simulated Annealing in Early Layers Leads to Better Generalization

Computer Vision and Pattern Recognition (CVPR), 2023

Mirco Ravanelli

195

10 Apr 2023

Rethinking Evaluation Protocols of Visual Representations Learned via Self-supervised Learning

221

07 Apr 2023

APPT: Asymmetric Parallel Point Transformer for 3D Point Cloud Understanding

228

31 Mar 2023

ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization

Computer Vision and Pattern Recognition (CVPR), 2023

Jintao Guo

Na Wang

Lei Qi

Yinghuan Shi

292

21 Mar 2023

DS-TDNN: Dual-stream Time-delay Neural Network with Global-aware Filter for Speaker Verification

IEEE/ACM Transactions on Audio Speech and Language Processing (TASLP), 2023

Yangfu Li

Jiapan Gan

Xiaodan Lin

283

20 Mar 2023

SRFormer: Permuted Self-Attention for Single Image Super-Resolution

Ming-Ming Cheng

124

17 Mar 2023

ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices

IEEE International Conference on Computer Vision (ICCV), 2023

Chen Tang

Huiqiang Jiang

Yuqing Yang

153

17 Mar 2023

Rethinking Optical Flow from Geometric Matching Consistent Perspective

Computer Vision and Pattern Recognition (CVPR), 2023

Qiaole Dong

Chenjie Cao

Yanwei Fu

270

15 Mar 2023

Masked Image Modeling with Local Multi-Scale Reconstruction

Computer Vision and Pattern Recognition (CVPR), 2023

202

09 Mar 2023

Point Cloud Classification Using Content-based Transformer via Clustering in Feature Space

IEEE/CAA Journal of Automatica Sinica (IEEE/CAA JAS), 2023

Yisheng Lv

Feiyue Wang

256

08 Mar 2023

FFT-based Dynamic Token Mixer for Vision

AAAI Conference on Artificial Intelligence (AAAI), 2023

Yuki Tatsunami

Masato Taki

306

07 Mar 2023

Self-attention in Vision Transformers Performs Perceptual Grouping, Not Attention

Paria Mehrani

John K. Tsotsos

228

02 Mar 2023

Understanding plasticity in neural networks

International Conference on Machine Learning (ICML), 2023

522

136

02 Mar 2023

Token Contrast for Weakly-Supervised Semantic Segmentation

Computer Vision and Pattern Recognition (CVPR), 2023

Lixiang Ru

Heliang Zheng

Yibing Zhan

Bo Du

ViT

334

140

02 Mar 2023

Swin Deformable Attention Hybrid U-Net for Medical Image Segmentation

Symposium on Medical Information Processing and Analysis (MIPA), 2023

Lichao Wang

Jiahao Huang

Xiaodan Xing

Guang Yang

119

28 Feb 2023

Teacher Intervention: Improving Convergence of Quantization Aware Training for Ultra-Low Precision Transformers

Conference of the European Chapter of the Association for Computational Linguistics (EACL), 2023

141

23 Feb 2023

MedViT: A Robust Vision Transformer for Generalized Medical Image Classification

271

320

19 Feb 2023

Efficiency 360: Efficient Vision Transformers

Badri N. Patro

Vijay Srinivas Agneeswaran

406

16 Feb 2023

TFormer: A Transmission-Friendly ViT Model for IoT Devices

IEEE Transactions on Parallel and Distributed Systems (TPDS), 2023

184

15 Feb 2023

Self-supervised pseudo-colorizing of masked cells

PLoS ONE (PLoS ONE), 2023

Royden Wagner

Carlos Fernandez Lopez

Christoph Stiller

143

12 Feb 2023

Revisiting Image Deblurring with an Efficient ConvNet

Hans-Peter Seidel

160

04 Feb 2023

Longformer: Longitudinal Transformer for Alzheimer's Disease Classification with Structural MRIs

IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023

Qiu-hui Chen

Yi Hong

MedIm

291

02 Feb 2023

Enhancing Face Recognition with Latent Space Data Augmentation and Facial Posture Reconstruction

Expert Systems with Applications (ESWA), 2023

Soroush Hashemifar

Abdolreza Marefat

Javad Hassannataj Joloudari

H. Hassanpour

CVBM

328

27 Jan 2023

A Simple Adaptive Unfolding Network for Hyperspectral Image Reconstruction

Shijie Wang

192

24 Jan 2023

Koopman neural operator as a mesh-free solver of non-linear partial differential equations

Journal of Computational Physics (JCP), 2023

344

24 Jan 2023

Learning to Exploit Temporal Structure for Biomedical Vision-Language Processing

Computer Vision and Pattern Recognition (CVPR), 2023

Shruthi Bannur

Stephanie L. Hyland

Qianchu Liu

Fernando Pérez-García

Maximilian Ilse

...

Maria T. A. Wetscherek

313

210

11 Jan 2023

KoopmanLab: machine learning for solving complex physics equations

APL Machine Learning (AML), 2023

356

03 Jan 2023

Representation Separation for Semantic Segmentation with Vision Transformers

179

28 Dec 2022

Investigation of Network Architecture for Multimodal Head-and-Neck Tumor Segmentation

Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 2022

174

21 Dec 2022

What do Vision Transformers Learn? A Visual Exploration

261

13 Dec 2022

Towards Practical Plug-and-Play Diffusion Models

Computer Vision and Pattern Recognition (CVPR), 2022

312

12 Dec 2022

Non-equispaced Fourier Neural Solvers for PDEs

Lirong Wu

Siyuan Li

239

09 Dec 2022

Group Generalized Mean Pooling for Vision Transformer

301

08 Dec 2022

Teaching Matters: Investigating the Role of Supervision in Vision Transformers

Computer Vision and Pattern Recognition (CVPR), 2022

378

07 Dec 2022

Pivotal Role of Language Modeling in Recommender Systems: Enriching Task-specific and Task-agnostic Representation Learning

Annual Meeting of the Association for Computational Linguistics (ACL), 2022

402

07 Dec 2022

Minority-Oriented Vicinity Expansion with Attentive Aggregation for Video Long-Tailed Recognition

AAAI Conference on Artificial Intelligence (AAAI), 2022

178

24 Nov 2022

Beyond the Field-of-View: Enhancing Scene Visibility and Perception with Clip-Recurrent Transformer

IEEE Transactions on Intelligent Vehicles (IEEE Trans. Intell. Veh.), 2022

Kailun Yang

Kaiwei Wang

263

21 Nov 2022

Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022

135

20 Nov 2022

Vision Transformers in Medical Imaging: A Review

257

18 Nov 2022

MogaNet: Multi-order Gated Aggregation Network

International Conference on Learning Representations (ICLR), 2022

285

124

07 Nov 2022

ViT-LSLA: Vision Transformer with Light Self-Limited-Attention

116

31 Oct 2022

Contextual Learning in Fourier Complex Field for VHR Remote Sensing Images

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2022

Xinbo Gao

142

28 Oct 2022