2108.08810
Do Vision Transformers See Like Convolutional Neural Networks?
19 August 2021
M. Raghu
Thomas Unterthiner
Simon Kornblith
Chiyuan Zhang
Alexey Dosovitskiy
ViT
Papers citing
"Do Vision Transformers See Like Convolutional Neural Networks?"
50 / 440 papers shown
Title
Vision Transformer-based Feature Extraction for Generalized Zero-Shot Learning
Jiseob Kim
Kyuhong Shim
Junhan Kim
B. Shim
ViT
19
12
0
02 Feb 2023
Inference Time Evidences of Adversarial Attacks for Forensic on Transformers
Hugo Lemarchant
Liang Li
Yiming Qian
Yuta Nakashima
Hajime Nagahara
ViT
AAML
38
0
0
31 Jan 2023
Cross-Architectural Positive Pairs improve the effectiveness of Self-Supervised Learning
P. Singh
Jacopo Cirrone
SSL
40
0
0
27 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
31
2
0
26 Jan 2023
Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review
Reza Azad
A. Kazerouni
Moein Heidari
Ehsan Khodapanah Aghdam
Amir Molaei
Yiwei Jia
Abin Jose
Rijo Roy
Dorit Merhof
MedIm
ViT
30
161
0
09 Jan 2023
A Study on the Generality of Neural Network Structures for Monocular Depth Estimation
Ji-Hoon Bae
K. Hwang
Sunghoon Im
MDE
21
7
0
09 Jan 2023
Skip-Attention: Improving Vision Transformers by Paying Less Attention
Shashanka Venkataramanan
Amir Ghodrati
Yuki M. Asano
Fatih Porikli
A. Habibian
ViT
15
25
0
05 Jan 2023
Dissecting Transformer Length Extrapolation via the Lens of Receptive Field Analysis
Ta-Chung Chi
Ting-Han Fan
Alexander I. Rudnicky
Peter J. Ramadge
25
40
0
20 Dec 2022
Rethinking Cooking State Recognition with Vision Transformers
A. Khan
Alif Ashrafee
Reeshoon Sayera
Shahriar Ivan
Sabbir Ahmed
ViT
19
7
0
16 Dec 2022
Two-stage Contextual Transformer-based Convolutional Neural Network for Airway Extraction from CT Images
Yanan Wu
Shuiqing Zhao
Shouliang Qi
Jie Feng
H. Pang
...
Long Bai
Meng-Yi Li
Shuyue Xia
W. Qian
Hongliang Ren
ViT
MedIm
19
24
0
15 Dec 2022
What do Vision Transformers Learn? A Visual Exploration
Amin Ghiasi
Hamid Kazemi
Eitan Borgnia
Steven Reich
Manli Shu
Micah Goldblum
A. Wilson
Tom Goldstein
ViT
24
60
0
13 Dec 2022
HeartBEiT: Vision Transformer for Electrocardiogram Data Improves Diagnostic Performance at Low Sample Sizes
A. Vaid
Joy Jiang
Ashwin S. Sawant
S. Lerakis
E. Argulian
...
Alexander W. Charney
H. Greenspan
Benjamin S. Glicksberg
MedIm
15
3
0
13 Dec 2022
CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet
Xiaoyi Dong
Jianmin Bao
Ting Zhang
Dongdong Chen
Shuyang Gu
Weiming Zhang
Lu Yuan
Dong Chen
Fang Wen
Nenghai Yu
CLIP
22
35
0
12 Dec 2022
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers
Yasheng Sun
Hang Zhou
Kaisiyuan Wang
Qianyi Wu
Zhibin Hong
Jingtuo Liu
Errui Ding
Jingdong Wang
Ziwei Liu
Koike Hideki
27
34
0
09 Dec 2022
Group Generalized Mean Pooling for Vision Transformer
ByungSoo Ko
Han-Gyu Kim
Byeongho Heo
Sangdoo Yun
Sanghyuk Chun
Geonmo Gu
Wonjae Kim
ViT
25
1
0
08 Dec 2022
Teaching Matters: Investigating the Role of Supervision in Vision Transformers
Matthew Walmer
Saksham Suri
Kamal Gupta
Abhinav Shrivastava
30
33
0
07 Dec 2022
CLIPascene: Scene Sketching with Different Types and Levels of Abstraction
Yael Vinker
Yuval Alaluf
Daniel Cohen-Or
Ariel Shamir
CLIP
19
54
0
30 Nov 2022
Hierarchical Transformer for Survival Prediction Using Multimodality Whole Slide Images and Genomics
Chunyuan Li
Xinliang Zhu
Jiawen Yao
Junzhou Huang
MedIm
30
11
0
29 Nov 2022
Finding Differences Between Transformers and ConvNets Using Counterfactual Simulation Testing
Nataniel Ruiz
Sarah Adel Bargal
Cihang Xie
Kate Saenko
Stan Sclaroff
ViT
28
5
0
29 Nov 2022
Adaptive Attention Link-based Regularization for Vision Transformers
Heegon Jin
Jongwon Choi
ViT
14
0
0
25 Nov 2022
Ladder Siamese Network: a Method and Insights for Multi-level Self-Supervised Learning
Ryota Yoshihashi
Shuhei Nishimura
Dai Yonebayashi
Yuya Otsuka
Tomohiro Tanaka
Takashi Miyazaki
SSL
24
2
0
25 Nov 2022
ModelDiff: A Framework for Comparing Learning Algorithms
Harshay Shah
Sung Min Park
Andrew Ilyas
A. Madry
SyDa
46
26
0
22 Nov 2022
TFormer: A throughout fusion transformer for multi-modal skin lesion diagnosis
Yilan Zhang
Feng-ying Xie
Jianqing Chen
MedIm
12
32
0
21 Nov 2022
Vision Transformer with Super Token Sampling
Huaibo Huang
Xiaoqiang Zhou
Jie Cao
Ran He
T. Tan
ViT
18
55
0
21 Nov 2022
Frozen Overparameterization: A Double Descent Perspective on Transfer Learning of Deep Neural Networks
Yehuda Dar
Lorenzo Luzi
Richard G. Baraniuk
AI4CE
8
1
0
20 Nov 2022
Delving into Transformer for Incremental Semantic Segmentation
Zekai Xu
Mingying Zhang
Jiayue Hou
Xing Gong
Chuan Wen
Chengjie Wang
Junge Zhang
CLL
19
1
0
18 Nov 2022
Vision Transformers in Medical Imaging: A Review
Emerald U. Henry
Onyeka Emebob
C. Omonhinmin
ViT
MedIm
22
34
0
18 Nov 2022
Knowledge distillation for fast and accurate DNA sequence correction
Anastasiya Belyaeva
Joel Shor
Daniel E. Cook
Kishwar Shafin
Daniel Liu
...
Alexey Kolesnikov
Cory Y. McLean
Maria Nattestad
Andrew Carroll
Pi-Chuan Chang
11
1
0
17 Nov 2022
EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones
Yulin Wang
Yang Yue
Rui Lu
Tian-De Liu
Zhaobai Zhong
S. Song
Gao Huang
32
28
0
17 Nov 2022
On the Effect of Pre-training for Transformer in Different Modality on Offline Reinforcement Learning
S. Takagi
OffRL
18
7
0
17 Nov 2022
Using Human Perception to Regularize Transfer Learning
Justin Dulay
Walter J. Scheirer
11
8
0
15 Nov 2022
ParCNetV2: Oversized Kernel with Enhanced Attention
Ruihan Xu
Haokui Zhang
Wenze Hu
Shiliang Zhang
Xiaoyu Wang
ViT
25
6
0
14 Nov 2022
Depth and Representation in Vision Models
Benjamin L. Badger
SSL
VLM
FAtt
24
3
0
11 Nov 2022
A Comprehensive Survey of Transformers for Computer Vision
Sonain Jamil
Md. Jalil Piran
Oh-Jin Kwon
ViT
30
46
0
11 Nov 2022
Much Easier Said Than Done: Falsifying the Causal Relevance of Linear Decoding Methods
L. Hayne
Abhijit Suresh
Hunar Jain
Rahul Kumar
R. M. Carter
FAtt
20
1
0
08 Nov 2022
MogaNet: Multi-order Gated Aggregation Network
Siyuan Li
Zedong Wang
Zicheng Liu
Cheng Tan
Haitao Lin
Di Wu
Zhiyuan Chen
Jiangbin Zheng
Stan Z. Li
26
56
0
07 Nov 2022
ViT-CX: Causal Explanation of Vision Transformers
Weiyan Xie
Xiao-hui Li
Caleb Chen Cao
Nevin L. Zhang
ViT
21
17
0
06 Nov 2022
Reliability of CKA as a Similarity Measure in Deep Learning
Mohammad-Javad Davari
Stefan Horoi
A. Natik
Guillaume Lajoie
Guy Wolf
Eugene Belilovsky
AAML
74
36
0
28 Oct 2022
Delving into Masked Autoencoders for Multi-Label Thorax Disease Classification
Junfei Xiao
Yutong Bai
Alan Yuille
Zongwei Zhou
MedIm
ViT
30
58
0
23 Oct 2022
When Can Transformers Ground and Compose: Insights from Compositional Generalization Benchmarks
Ankur Sikarwar
Arkil Patel
Navin Goyal
ViT
23
10
0
23 Oct 2022
The Curious Case of Absolute Position Embeddings
Koustuv Sinha
Amirhossein Kazemnejad
Siva Reddy
J. Pineau
Dieuwke Hupkes
Adina Williams
83
15
0
23 Oct 2022
Boosting vision transformers for image retrieval
Chull Hwan Song
Jooyoung Yoon
Shunghyun Choi
Yannis Avrithis
ViT
24
31
0
21 Oct 2022
Similarity of Neural Architectures using Adversarial Attack Transferability
Jaehui Hwang
Dongyoon Han
Byeongho Heo
Song Park
Sanghyuk Chun
Jong-Seok Lee
AAML
24
1
0
20 Oct 2022
How Does a Deep Learning Model Architecture Impact Its Privacy? A Comprehensive Study of Privacy Attacks on CNNs and Transformers
Guangsheng Zhang
B. Liu
Huan Tian
Tianqing Zhu
Ming Ding
Wanlei Zhou
PILM
MIACV
12
5
0
20 Oct 2022
Scratching Visual Transformer's Back with Uniform Attention
Nam Hyeon-Woo
Kim Yu-Ji
Byeongho Heo
Dongyoon Han
Seong Joon Oh
Tae-Hyun Oh
348
23
0
16 Oct 2022
Vision Transformer Visualization: What Neurons Tell and How Neurons Behave?
Van-Anh Nguyen
Khanh Pham Dinh
L. Vuong
Thanh-Toan Do
Quan Hung Tran
Dinh Q. Phung
Trung Le
ViT
4
2
0
14 Oct 2022
Vision Transformers provably learn spatial structure
Samy Jelassi
Michael E. Sander
Yuanzhi Li
ViT
MLT
32
73
0
13 Oct 2022
How to Train Vision Transformer on Small-scale Datasets?
Hanan Gani
Muzammal Naseer
Mohammad Yaqub
ViT
12
49
0
13 Oct 2022
Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer
Yanjing Li
Sheng Xu
Baochang Zhang
Xianbin Cao
Penglei Gao
Guodong Guo
MQ
ViT
26
89
0
13 Oct 2022
Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
Zhiying Lu
Hongtao Xie
Chuanbin Liu
Yongdong Zhang
ViT
10
57
0
12 Oct 2022