Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention
7 February 2021
Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, G. Fung, Yin Li, Vikas Singh
arXiv: 2102.03902 (abs · PDF · HTML) · GitHub (376★)

Papers citing "Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention" (45 of 145 papers shown)

| Title | Authors | Tags | Metrics | Date |
| --- | --- | --- | --- | --- |
| Learning Operators with Coupled Attention | Georgios Kissas, Jacob H. Seidman, Leonardo Ferreira Guilhoto, V. Preciado, George J. Pappas, P. Perdikaris | | 86 · 113 · 0 | 04 Jan 2022 |
| Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models | Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji | | 76 · 6 · 0 | 03 Jan 2022 |
| Multi-Dimensional Model Compression of Vision Transformer | Zejiang Hou, S. Kung | ViT | 66 · 18 · 0 | 31 Dec 2021 |
| Simple Local Attentions Remain Competitive for Long-Context Tasks | Wenhan Xiong, Barlas Oğuz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad | | 97 · 29 · 0 | 14 Dec 2021 |
| Couplformer: Rethinking Vision Transformer with Coupling Attention Map | Hai Lan, Xihao Wang, Xian Wei | ViT | 89 · 3 · 0 | 10 Dec 2021 |
| Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences | Yifan Chen, Qi Zeng, Dilek Z. Hakkani-Tür, Di Jin, Heng Ji, Yun Yang | | 83 · 5 · 0 | 10 Dec 2021 |
| Forward Operator Estimation in Generative Models with Kernel Transfer Operators | Z. Huang, Rudrasis Chakraborty, Vikas Singh | GAN | 50 · 3 · 0 | 01 Dec 2021 |
| Pixelated Butterfly: Simple and Efficient Sparse Training for Neural Network Models | Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré | | 133 · 79 · 0 | 30 Nov 2021 |
| Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers | John Guibas, Morteza Mardani, Zong-Yi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro | | 113 · 246 · 0 | 24 Nov 2021 |
| SOFT: Softmax-free Transformer with Linear Complexity | Jiachen Lu, Jinghan Yao, Junge Zhang, Martin Danelljan, Hang Xu, Weiguo Gao, Chunjing Xu, Thomas B. Schön, Li Zhang | | 78 · 166 · 0 | 22 Oct 2021 |
| Improving Transformers with Probabilistic Attention Keys | Tam Nguyen, T. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher | | 129 · 33 · 0 | 16 Oct 2021 |
| On Learning the Transformer Kernel | Sankalan Pal Chowdhury, Adamos Solomou, Kumar Avinava Dubey, Mrinmaya Sachan | ViT | 131 · 14 · 0 | 15 Oct 2021 |
| The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization | Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber | AI4CE | 115 · 57 · 0 | 14 Oct 2021 |
| Geometric Transformers for Protein Interface Contact Prediction | Alex Morehead, Chen Chen, Jianlin Cheng | | 136 · 29 · 0 | 06 Oct 2021 |
| UFO-ViT: High Performance Linear Vision Transformer without Softmax | Jeonggeun Song | ViT | 175 · 21 · 0 | 29 Sep 2021 |
| Long-Range Transformers for Dynamic Spatiotemporal Forecasting | J. E. Grigsby, Zhe Wang, Nam Nguyen, Yanjun Qi | AI4TS | 121 · 95 · 0 | 24 Sep 2021 |
| Sparse Factorization of Large Square Matrices | Ruslan Khalitov, Tong Yu, Lei Cheng, Zhirong Yang | | 26 · 2 · 0 | 16 Sep 2021 |
| The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning | Yujin Tang, David R Ha | | 108 · 77 · 0 | 07 Sep 2021 |
| PermuteFormer: Efficient Relative Position Encoding for Long Sequences | Peng-Jen Chen | | 93 · 21 · 0 | 06 Sep 2021 |
| Neural TMDlayer: Modeling Instantaneous Flow of Features via SDE Generators | Zihang Meng, Vikas Singh, Sathya Ravi | | 49 · 1 · 0 | 19 Aug 2021 |
| FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention | T. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang | | 72 · 36 · 0 | 05 Aug 2021 |
| Perceiver IO: A General Architecture for Structured Inputs & Outputs | Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, ..., Olivier J. Hénaff, M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira | MLLM, VLM, GNN | 148 · 585 · 0 | 30 Jul 2021 |
| PKSpell: Data-Driven Pitch Spelling and Key Signature Estimation | Francesco Foscarin, Nicolas Audebert, Raphaël Fournier-S’niehotta | | 31 · 11 · 0 | 27 Jul 2021 |
| From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers | K. Choromanski, Han Lin, Haoxian Chen, Tianyi Zhang, Arijit Sehanobish, Valerii Likhosherstov, Jack Parker-Holder, Tamás Sarlós, Adrian Weller, Thomas Weingarten | | 115 · 34 · 0 | 16 Jul 2021 |
| Grid Partitioned Attention: Efficient Transformer Approximation with Inductive Bias for High Resolution Detail Generation | Nikolay Jetchev, Gökhan Yildirim, Christian Bracher, Roland Vollgraf | | 26 · 0 · 0 | 08 Jul 2021 |
| Vision Xformers: Efficient Attention for Image Classification | Pranav Jeevan, Amit Sethi | ViT | 48 · 13 · 0 | 05 Jul 2021 |
| Long-Short Transformer: Efficient Transformers for Language and Vision | Chen Zhu, Ming-Yu Liu, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro | ViT, VLM | 119 · 133 · 0 | 05 Jul 2021 |
| Closed-form Continuous-time Neural Models | Ramin Hasani, Mathias Lechner, Alexander Amini, Lucas Liebenwein, Aaron Ray, Max Tschaikowski, G. Teschl, Daniela Rus | PINN, AI4TS | 104 · 91 · 0 | 25 Jun 2021 |
| Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding | Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu | | 95 · 50 · 0 | 23 Jun 2021 |
| XCiT: Cross-Covariance Image Transformers | Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, ..., Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou | ViT | 151 · 517 · 0 | 17 Jun 2021 |
| Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers | Mandela Patrick, Dylan Campbell, Yuki M. Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques | | 114 · 282 · 0 | 09 Jun 2021 |
| Densely connected normalizing flows | Matej Grcić, Ivan Grubišić, Siniša Šegvić | TPM | 97 · 59 · 0 | 08 Jun 2021 |
| A Survey of Transformers | Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu | ViT | 202 · 1,148 · 0 | 08 Jun 2021 |
| Detect the Interactions that Matter in Matter: Geometric Attention for Many-Body Systems | Thorben Frank, Stefan Chmiela | | 51 · 3 · 0 | 04 Jun 2021 |
| TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification | Zhucheng Shao, Hao Bian, Yang Chen, Yifeng Wang, Jian Zhang, Xiangyang Ji, Yongbing Zhang | ViT, MedIm | 138 · 689 · 0 | 02 Jun 2021 |
| Choose a Transformer: Fourier or Galerkin | Shuhao Cao | | 90 · 256 · 0 | 31 May 2021 |
| A Practical Survey on Faster and Lighter Transformers | Quentin Fournier, G. Caron, Daniel Aloise | | 137 · 104 · 0 | 26 Mar 2021 |
| U-Net Transformer: Self and Cross Attention for Medical Image Segmentation | Olivier Petit, Nicolas Thome, Clément Rambour, L. Soler | ViT, MedIm | 99 · 250 · 0 | 10 Mar 2021 |
| Beyond Nyströmformer -- Approximation of self-attention by Spectral Shifting | Madhusudan Verma | | 48 · 1 · 0 | 09 Mar 2021 |
| Perceiver: General Perception with Iterative Attention | Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, João Carreira | VLM, ViT, MDE | 214 · 1,029 · 0 | 04 Mar 2021 |
| LambdaNetworks: Modeling Long-Range Interactions Without Attention | Irwan Bello | | 359 · 181 · 0 | 17 Feb 2021 |
| TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up | Yi Ding, Shiyu Chang, Zhangyang Wang | ViT | 154 · 393 · 0 | 14 Feb 2021 |
| Transformers in Vision: A Survey | Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, M. Shah | ViT | 387 · 2,567 · 0 | 04 Jan 2021 |
| Point Transformer | Nico Engel, Vasileios Belagiannis, Klaus C. J. Dietmayer | 3DPC | 190 · 2,023 · 0 | 02 Nov 2020 |
| Efficient Transformers: A Survey | Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler | VLM | 230 · 1,136 · 0 | 14 Sep 2020 |