2108.08810
Do Vision Transformers See Like Convolutional Neural Networks?
19 August 2021
M. Raghu
Thomas Unterthiner
Simon Kornblith
Chiyuan Zhang
Alexey Dosovitskiy
ViT
Papers citing
"Do Vision Transformers See Like Convolutional Neural Networks?"
50 / 440 papers shown
Title
CellCentroidFormer: Combining Self-attention and Convolution for Cell Detection
Royden Wagner
K. Rohr
ViT
MedIm
20
5
0
01 Jun 2022
What Knowledge Gets Distilled in Knowledge Distillation?
Utkarsh Ojha
Yuheng Li
Anirudh Sundara Rajan
Yingyu Liang
Yong Jae Lee
FedML
21
18
0
31 May 2022
A Closer Look at Self-Supervised Lightweight Vision Transformers
Shaoru Wang
Jin Gao
Zeming Li
Jian-jun Sun
Weiming Hu
ViT
62
41
0
28 May 2022
On the Symmetries of Deep Learning Models and their Internal Representations
Charles Godfrey
Davis Brown
Tegan H. Emerson
Henry Kvinge
20
40
0
27 May 2022
Inception Transformer
Chenyang Si
Weihao Yu
Pan Zhou
Yichen Zhou
Xinchao Wang
Shuicheng Yan
ViT
26
187
0
25 May 2022
SCVRL: Shuffled Contrastive Video Representation Learning
Michael Dorkenwald
Fanyi Xiao
Biagio Brattoli
Joseph Tighe
Davide Modolo
SSL
46
16
0
24 May 2022
A Unified and Biologically-Plausible Relational Graph Representation of Vision Transformers
Yuzhong Chen
Yu Du
Zhe Xiao
Lin Zhao
Lu Zhang
...
Dajiang Zhu
Tuo Zhang
Xintao Hu
Tianming Liu
Xi Jiang
ViT
19
5
0
20 May 2022
Composing General Audio Representation by Fusing Multilayer Features of a Pre-trained Model
Daisuke Niizumi
Daiki Takeuchi
Yasunori Ohishi
N. Harada
K. Kashino
16
5
0
17 May 2022
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers
Arda Sahiner
Tolga Ergen
Batu Mehmet Ozturkler
John M. Pauly
Morteza Mardani
Mert Pilanci
24
33
0
17 May 2022
Activating More Pixels in Image Super-Resolution Transformer
Xiangyu Chen
Xintao Wang
Jiantao Zhou
Yu Qiao
Chao Dong
ViT
59
600
0
09 May 2022
Better plain ViT baselines for ImageNet-1k
Lucas Beyer
Xiaohua Zhai
Alexander Kolesnikov
ViT
VLM
24
111
0
03 May 2022
Where in the World is this Image? Transformer-based Geo-localization in the Wild
Shraman Pramanick
E. Nowara
Joshua Gleason
Carlos D. Castillo
Rama Chellappa
ViT
13
30
0
29 Apr 2022
CogView2: Faster and Better Text-to-Image Generation via Hierarchical Transformers
Ming Ding
Wendi Zheng
Wenyi Hong
Jie Tang
VLM
24
321
0
28 Apr 2022
Understanding The Robustness in Vision Transformers
Daquan Zhou
Zhiding Yu
Enze Xie
Chaowei Xiao
Anima Anandkumar
Jiashi Feng
J. Álvarez
ViT
14
185
0
26 Apr 2022
How to Listen? Rethinking Visual Sound Localization
Ho-Hsiang Wu
Magdalena Fuentes
Prem Seetharaman
J. P. Bello
ObjD
22
4
0
11 Apr 2022
Comparison Analysis of Traditional Machine Learning and Deep Learning Techniques for Data and Image Classification
Efstathios Karypidis
Stylianos G. Mouslech
Kassiani Skoulariki
Alexandros Gazis
18
14
0
11 Apr 2022
DualPrompt: Complementary Prompting for Rehearsal-free Continual Learning
Zifeng Wang
Zizhao Zhang
Sayna Ebrahimi
Ruoxi Sun
Han Zhang
...
Xiaoqi Ren
Guolong Su
Vincent Perot
Jennifer Dy
Tomas Pfister
CLL
VLM
VPVLM
28
455
0
10 Apr 2022
UNetFormer: A Unified Vision Transformer Model and Pre-Training Framework for 3D Medical Image Segmentation
Ali Hatamizadeh
Ziyue Xu
Dong Yang
Wenqi Li
H. Roth
Daguang Xu
ViT
MedIm
21
29
0
01 Apr 2022
MAT: Mask-Aware Transformer for Large Hole Image Inpainting
Wenbo Li
Zhe-nan Lin
Kun Zhou
Lu Qi
Yi Wang
Jiaya Jia
25
306
0
29 Mar 2022
GradViT: Gradient Inversion of Vision Transformers
Ali Hatamizadeh
Hongxu Yin
H. Roth
Wenqi Li
Jan Kautz
Daguang Xu
Pavlo Molchanov
ViT
17
63
0
22 Mar 2022
AnoViT: Unsupervised Anomaly Detection and Localization with Vision Transformer-based Encoder-Decoder
Yunseung Lee
Pilsung Kang
ViT
16
73
0
21 Mar 2022
ViM: Out-Of-Distribution with Virtual-logit Matching
Haoqi Wang
Zhizhong Li
Litong Feng
Wayne Zhang
OODD
15
309
0
21 Mar 2022
2-speed network ensemble for efficient classification of incremental land-use/land-cover satellite image chips
M. J. Horry
Subrata Chakraborty
B. Pradhan
N. Shukla
Sanjoy Paul
21
1
0
15 Mar 2022
Unified Visual Transformer Compression
Shixing Yu
Tianlong Chen
Jiayi Shen
Huan Yuan
Jianchao Tan
Sen Yang
Ji Liu
Zhangyang Wang
ViT
14
91
0
15 Mar 2022
Fast Autofocusing using Tiny Transformer Networks for Digital Holographic Microscopy
Stéphane Cuenat
Louis Andréoli
Antoine N. André
P. Sandoz
G. Laurent
R. Couturier
M. Jacquot
24
10
0
15 Mar 2022
CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification
Yuan Gong
Sameer Khurana
Andrew Rouditchenko
James R. Glass
VLM
25
29
0
13 Mar 2022
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs
Xiaohan Ding
X. Zhang
Yi Zhou
Jungong Han
Guiguang Ding
Jian-jun Sun
VLM
47
528
0
13 Mar 2022
Active Token Mixer
Guoqiang Wei
Zhizheng Zhang
Cuiling Lan
Yan Lu
Zhibo Chen
18
15
0
11 Mar 2022
Visualizing and Understanding Patch Interactions in Vision Transformer
Jie Ma
Yalong Bai
Bineng Zhong
Wei Zhang
Ting Yao
Tao Mei
ViT
8
32
0
11 Mar 2022
Stepwise Feature Fusion: Local Guides Global
Jinfeng Wang
Qiming Huang
Feilong Tang
Jia Meng
Jionglong Su
Sifan Song
ViT
MedIm
19
179
0
07 Mar 2022
What Makes Transfer Learning Work For Medical Images: Feature Reuse & Other Factors
Christos Matsoukas
Johan Fredin Haslum
Moein Sorkhei
Magnus P Soderberg
Kevin Smith
VLM
OOD
MedIm
22
85
0
02 Mar 2022
Auto-scaling Vision Transformers without Training
Wuyang Chen
Wei Huang
Xianzhi Du
Xiaodan Song
Zhangyang Wang
Denny Zhou
ViT
27
23
0
24 Feb 2022
Vision-Language Pre-Training with Triple Contrastive Learning
Jinyu Yang
Jiali Duan
Son N. Tran
Yi Xu
Sampath Chanda
Liqun Chen
Belinda Zeng
Trishul M. Chilimbi
Junzhou Huang
VLM
29
288
0
21 Feb 2022
Hilbert Flattening: a Locality-Preserving Matrix Unfolding Method for Visual Discrimination
Qingsong Zhao
Shuguang Dou
Zhipeng Zhou
Yangguang Li
Yin Wang
Yu Qiao
Cairong Zhao
20
3
0
21 Feb 2022
Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations
Yoshihiro Yamazaki
Shota Orihashi
Ryo Masumura
Mihiro Uchida
Akihiko Takashima
18
8
0
21 Feb 2022
NetSentry: A Deep Learning Approach to Detecting Incipient Large-scale Network Attacks
Haoyu Liu
P. Patras
AAML
8
10
0
20 Feb 2022
How Do Vision Transformers Work?
Namuk Park
Songkuk Kim
ViT
25
465
0
14 Feb 2022
BViT: Broad Attention based Vision Transformer
Nannan Li
Yaran Chen
Weifan Li
Zixiang Ding
Dong Zhao
ViT
30
23
0
13 Feb 2022
Investigating Power laws in Deep Representation Learning
Arna Ghosh
Arnab Kumar Mondal
Kumar Krishna Agrawal
Blake A. Richards
SSL
OOD
11
19
0
11 Feb 2022
How to Understand Masked Autoencoders
Shuhao Cao
Peng-Tao Xu
David A. Clifton
21
40
0
08 Feb 2022
Transformers in Self-Supervised Monocular Depth Estimation with Unknown Camera Intrinsics
Arnav Varma
Hemang Chawla
Bahram Zonooz
Elahe Arani
ViT
MDE
31
49
0
07 Feb 2022
Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers
Amir Ardalan Kalantari
Mohammad Amini
Sarath Chandar
Doina Precup
44
4
0
01 Feb 2022
StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
Axel Sauer
Katja Schwarz
Andreas Geiger
182
489
0
01 Feb 2022
Deconfounded Representation Similarity for Comparison of Neural Networks
Tianyu Cui
Yogesh Kumar
Pekka Marttinen
Samuel Kaski
CML
24
13
0
31 Jan 2022
Nearest Class-Center Simplification through Intermediate Layers
Ido Ben-Shaul
S. Dekel
38
26
0
21 Jan 2022
Attention-based Proposals Refinement for 3D Object Detection
Minh-Quan Dao
Elwan Héry
Vincent Frémont
3DPC
16
2
0
18 Jan 2022
VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer
Mengshu Sun
Haoyu Ma
Guoliang Kang
Yifan Jiang
Tianlong Chen
Xiaolong Ma
Zhangyang Wang
Yanzhi Wang
ViT
25
45
0
17 Jan 2022
Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images
Ali Hatamizadeh
V. Nath
Yucheng Tang
Dong Yang
H. Roth
Daguang Xu
ViT
MedIm
17
1,054
0
04 Jan 2022
Modeling Mask Uncertainty in Hyperspectral Image Reconstruction
Jiamian Wang
Yulun Zhang
X. Yuan
Ziyi Meng
Zhiqiang Tao
11
9
0
31 Dec 2021
CSformer: Bridging Convolution and Transformer for Compressive Sensing
Dongjie Ye
Zhangkai Ni
Hanli Wang
Jian Andrew Zhang
Shiqi Wang
Sam Kwong
ViT
MedIm
21
51
0
31 Dec 2021