Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth
Yihe Dong, Jean-Baptiste Cordonnier, Andreas Loukas
5 March 2021 (arXiv:2103.03404)
Papers citing "Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth" (38 of 238 citing papers shown):
- The self-supervised spectral-spatial attention-based transformer network for automated, accurate prediction of crop nitrogen status from UAV imagery (12 Nov 2021). Xin Zhang, Liangxiu Han, Tam Sobeih, Lewis Lappin, Mark A. Lee, Andrew Howard, A. Kisdi. Tags: ViT. 1 citation.
- A Survey of Visual Transformers (11 Nov 2021). Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He. Tags: 3DGS, ViT. 330 citations.
- Can Vision Transformers Perform Convolution? (02 Nov 2021). Shanda Li, Xiangning Chen, Di He, Cho-Jui Hsieh. Tags: ViT. 19 citations.
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (29 Oct 2021). Yifan Chen, Qi Zeng, Heng Ji, Yun Yang. 49 citations.
- Sinkformers: Transformers with Doubly Stochastic Attention (22 Oct 2021). Michael E. Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré. 76 citations.
- An Empirical Study: Extensive Deep Temporal Point Process (19 Oct 2021). Haitao Lin, Cheng Tan, Lirong Wu, Zhangyang Gao, Stan Z. Li. Tags: AI4TS. 12 citations.
- Pre-trained Language Models in Biomedical Domain: A Systematic Survey (11 Oct 2021). Benyou Wang, Qianqian Xie, Jiahuan Pei, Zhihong Chen, Prayag Tiwari, Zhao Li, Jie Fu. Tags: LM&MA, AI4CE. 163 citations.
- Abstraction, Reasoning and Deep Learning: A Study of the "Look and Say" Sequence (27 Sep 2021). Wlodek Zadrozny. 1 citation.
- EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation (15 Sep 2021). Chenhe Dong, Guangrun Wang, Hang Xu, Jiefeng Peng, Xiaozhe Ren, Xiaodan Liang. 28 citations.
- Incorporating Residual and Normalization Layers into Analysis of Masked Language Models (15 Sep 2021). Goro Kobayashi, Tatsuki Kuribayashi, Sho Yokoi, Kentaro Inui. 46 citations.
- Is Attention Better Than Matrix Decomposition? (09 Sep 2021). Zhengyang Geng, Meng-Hao Guo, Hongxu Chen, Xia Li, Ke Wei, Zhouchen Lin. 137 citations.
- TCCT: Tightly-Coupled Convolutional Transformer on Time Series Forecasting (29 Aug 2021). Li Shen, Yangzhu Wang. Tags: AI4TS. 92 citations.
- On the Effect of Pruning on Adversarial Robustness (10 Aug 2021). Artur Jordão, Hélio Pedrini. Tags: AAML. 22 citations.
- Unsupervised Discovery of Object Radiance Fields (16 Jul 2021). Hong-Xing Yu, Leonidas J. Guibas, Jiajun Wu. Tags: OCL. 121 citations.
- AutoBERT-Zero: Evolving BERT Backbone from Scratch (15 Jul 2021). Jiahui Gao, Hang Xu, Han Shi, Xiaozhe Ren, Philip L. H. Yu, Xiaodan Liang, Xin Jiang, Zhenguo Li. 37 citations.
- Locally Enhanced Self-Attention: Combining Self-Attention and Convolution as Local and Context Terms (12 Jul 2021). Chenglin Yang, Siyuan Qiao, Adam Kortylewski, Alan Yuille. 4 citations.
- ViTGAN: Training GANs with Vision Transformers (09 Jul 2021). Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Z. Tu, Ce Liu. Tags: ViT. 183 citations.
- SpectralFormer: Rethinking Hyperspectral Image Classification with Transformers (07 Jul 2021). Danfeng Hong, Zhu Han, Jing Yao, Lianru Gao, Bing Zhang, Antonio J. Plaza, Jocelyn Chanussot. Tags: ViT. 864 citations.
- Augmented Shortcuts for Vision Transformers (30 Jun 2021). Yehui Tang, Kai Han, Chang Xu, An Xiao, Yiping Deng, Chao Xu, Yunhe Wang. Tags: ViT. 39 citations.
- P2T: Pyramid Pooling Transformer for Scene Understanding (22 Jun 2021). Yu-Huan Wu, Yun-Hai Liu, Xin Zhan, Ming-Ming Cheng. Tags: ViT. 218 citations.
- GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures (10 Jun 2021). Ivan Chelombiev, Daniel Justus, Douglas Orr, A. Dietrich, Frithjof Gressmann, A. Koliousis, Carlo Luschi. 5 citations.
- A Survey of Transformers (08 Jun 2021). Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu. Tags: ViT. 1,084 citations.
- What training reveals about neural network complexity (08 Jun 2021). Andreas Loukas, Marinos Poiitis, Stefanie Jegelka. 11 citations.
- Generative Flows with Invertible Attentions (07 Jun 2021). R. Sukthanker, Zhiwu Huang, Suryansh Kumar, Radu Timofte, Luc Van Gool. 14 citations.
- On the Expressive Power of Self-Attention Matrices (07 Jun 2021). Valerii Likhosherstov, K. Choromanski, Adrian Weller. 33 citations.
- Refiner: Refining Self-attention for Vision Transformers (07 Jun 2021). Daquan Zhou, Yujun Shi, Bingyi Kang, Weihao Yu, Zihang Jiang, Yuan Li, Xiaojie Jin, Qibin Hou, Jiashi Feng. Tags: ViT. 59 citations.
- Analogous to Evolutionary Algorithm: Designing a Unified Sequence Model (31 May 2021). Jiangning Zhang, Chao Xu, Jian Li, Wenzhou Chen, Yabiao Wang, Ying Tai, Shuo Chen, Chengjie Wang, Feiyue Huang, Yong Liu. 22 citations.
- Knowledge Neurons in Pretrained Transformers (18 Apr 2021). Damai Dai, Li Dong, Y. Hao, Zhifang Sui, Baobao Chang, Furu Wei. Tags: KELM, MU. 414 citations.
- Higher Order Recurrent Space-Time Transformer for Video Action Prediction (17 Apr 2021). Tsung-Ming Tai, G. Fiameni, Cheng-Kuang Lee, O. Lanz. 9 citations.
- Not All Attention Is All You Need (10 Apr 2021). Hongqiu Wu, Hai Zhao, Min Zhang. 9 citations.
- Understanding Robustness of Transformers for Image Classification (26 Mar 2021). Srinadh Bhojanapalli, Ayan Chakrabarti, Daniel Glasner, Daliang Li, Thomas Unterthiner, Andreas Veit. Tags: ViT. 377 citations.
- Multi-view 3D Reconstruction with Transformer (24 Mar 2021). Dan Wang, Xinrui Cui, Xun Chen, Zhengxia Zou, Tianyang Shi, Septimiu Salcudean, Z. J. Wang, Rabab Ward. Tags: ViT. 86 citations.
- ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases (19 Mar 2021). Stéphane d'Ascoli, Hugo Touvron, Matthew L. Leavitt, Ari S. Morcos, Giulio Biroli, Levent Sagun. Tags: ViT. 803 citations.
- When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute (24 Feb 2021). Tao Lei. Tags: RALM, VLM. 47 citations.
- Influence Patterns for Explaining Information Flow in BERT (02 Nov 2020). Kaiji Lu, Zifan Wang, Piotr Mardziel, Anupam Datta. Tags: GNN. 16 citations.
- Big Bird: Transformers for Longer Sequences (28 Jul 2020). Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed. Tags: VLM. 2,012 citations.
- GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (20 Apr 2018). Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman. Tags: ELM. 6,950 citations.
- A Decomposable Attention Model for Natural Language Inference (06 Jun 2016). Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit. 1,367 citations.