Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention

7 February 2021
Yunyang Xiong, Zhanpeng Zeng, Rudrasis Chakraborty, Mingxing Tan, G. Fung, Yin Li, Vikas Singh
ArXiv (abs) · PDF · HTML · GitHub (376★)
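The paper's core idea is to replace the full n × n softmax attention matrix with a Nyström-style low-rank reconstruction built from a small set of m landmark queries and keys, reducing the cost from O(n²) toward O(nm). Below is a minimal NumPy sketch of that general scheme, not the paper's exact algorithm: Nyströmformer selects landmarks as segment means and approximates the pseudoinverse iteratively, whereas this sketch uses evenly spaced landmark indices and numpy.linalg.pinv.

import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def nystrom_attention(Q, K, V, m=32):
    # Approximates softmax(Q K^T / sqrt(d)) V without forming the n x n matrix.
    # Landmark choice (evenly spaced rows) is an illustrative assumption;
    # the paper uses segment means.
    n, d = Q.shape
    idx = np.linspace(0, n - 1, m).astype(int)
    Q_m, K_m = Q[idx], K[idx]
    scale = 1.0 / np.sqrt(d)
    F = softmax(Q @ K_m.T * scale)    # n x m: all queries vs. landmark keys
    A = softmax(Q_m @ K_m.T * scale)  # m x m: landmark-to-landmark block
    B = softmax(Q_m @ K.T * scale)    # m x n: landmark queries vs. all keys
    return F @ np.linalg.pinv(A) @ (B @ V)  # (n x m)(m x m)(m x d_v)

# Example: 512 tokens, 64-dim head, 32 landmarks.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((512, 64)) for _ in range(3))
out = nystrom_attention(Q, K, V, m=32)  # shape (512, 64)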

Papers citing "Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention"

45 / 145 papers shown

Learning Operators with Coupled Attention
Georgios Kissas, Jacob H. Seidman, Leonardo Ferreira Guilhoto, V. Preciado, George J. Pappas, P. Perdikaris
86 · 113 · 0 · 04 Jan 2022

Which Student is Best? A Comprehensive Knowledge Distillation Exam for Task-Specific BERT Models
Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Rendi Chevi, Radityo Eko Prasojo, Alham Fikri Aji
76 · 6 · 0 · 03 Jan 2022

Multi-Dimensional Model Compression of Vision Transformer
Zejiang Hou, S. Kung
ViT · 66 · 18 · 0 · 31 Dec 2021

Simple Local Attentions Remain Competitive for Long-Context Tasks
Wenhan Xiong, Barlas Ouguz, Anchit Gupta, Xilun Chen, Diana Liskovich, Omer Levy, Wen-tau Yih, Yashar Mehdad
97 · 29 · 0 · 14 Dec 2021

Couplformer: Rethinking Vision Transformer with Coupling Attention Map
Hai Lan, Xihao Wang, Xian Wei
ViT · 89 · 3 · 0 · 10 Dec 2021

Sketching as a Tool for Understanding and Accelerating Self-attention for Long Sequences
Yifan Chen, Qi Zeng, Dilek Z. Hakkani-Tür, Di Jin, Heng Ji, Yun Yang
83 · 5 · 0 · 10 Dec 2021

Forward Operator Estimation in Generative Models with Kernel Transfer Operators
Z. Huang, Rudrasis Chakraborty, Vikas Singh
GAN · 50 · 3 · 0 · 01 Dec 2021

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré
133 · 79 · 0 · 30 Nov 2021

Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers
John Guibas, Morteza Mardani, Zong-Yi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro
113 · 246 · 0 · 24 Nov 2021

SOFT: Softmax-free Transformer with Linear Complexity
Jiachen Lu, Jinghan Yao, Junge Zhang, Martin Danelljan, Hang Xu, Weiguo Gao, Chunjing Xu, Thomas B. Schon, Li Zhang
78 · 166 · 0 · 22 Oct 2021

Improving Transformers with Probabilistic Attention Keys
Tam Nguyen, T. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher
129 · 33 · 0 · 16 Oct 2021

On Learning the Transformer Kernel
Sankalan Pal Chowdhury, Adamos Solomou, Kumar Avinava Dubey, Mrinmaya Sachan
ViT · 131 · 14 · 0 · 15 Oct 2021

The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization
Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
AI4CE · 115 · 57 · 0 · 14 Oct 2021

Geometric Transformers for Protein Interface Contact Prediction
Alex Morehead, Chen Chen, Jianlin Cheng
136 · 29 · 0 · 06 Oct 2021

UFO-ViT: High Performance Linear Vision Transformer without Softmax
Jeonggeun Song
ViT · 175 · 21 · 0 · 29 Sep 2021

Long-Range Transformers for Dynamic Spatiotemporal Forecasting
J. E. Grigsby, Zhe Wang, Nam Nguyen, Yanjun Qi
AI4TS · 121 · 95 · 0 · 24 Sep 2021

Sparse Factorization of Large Square Matrices
Ruslan Khalitov, Tong Yu, Lei Cheng, Zhirong Yang
26 · 2 · 0 · 16 Sep 2021

The Sensory Neuron as a Transformer: Permutation-Invariant Neural Networks for Reinforcement Learning
Yujin Tang, David R Ha
108 · 77 · 0 · 07 Sep 2021

PermuteFormer: Efficient Relative Position Encoding for Long Sequences
Peng-Jen Chen
93 · 21 · 0 · 06 Sep 2021

Neural TMDlayer: Modeling Instantaneous flow of features via SDE Generators
Zihang Meng, Vikas Singh, Sathya Ravi
49 · 1 · 0 · 19 Aug 2021

FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention
T. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang
72 · 36 · 0 · 05 Aug 2021

Perceiver IO: A General Architecture for Structured Inputs & Outputs
Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, ..., Olivier J. Hénaff, M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira
MLLM · VLM · GNN · 148 · 585 · 0 · 30 Jul 2021

PKSpell: Data-Driven Pitch Spelling and Key Signature Estimation
Francesco Foscarin, Nicolas Audebert, Raphaël Fournier-S’niehotta
31 · 11 · 0 · 27 Jul 2021

From block-Toeplitz matrices to differential equations on graphs: towards a general theory for scalable masked Transformers
K. Choromanski, Han Lin, Haoxian Chen, Tianyi Zhang, Arijit Sehanobish, Valerii Likhosherstov, Jack Parker-Holder, Tamás Sarlós, Adrian Weller, Thomas Weingarten
115 · 34 · 0 · 16 Jul 2021

Grid Partitioned Attention: Efficient Transformer Approximation with Inductive Bias for High Resolution Detail Generation
Nikolay Jetchev, Gökhan Yildirim, Christian Bracher, Roland Vollgraf
26 · 0 · 0 · 08 Jul 2021

Vision Xformers: Efficient Attention for Image Classification
Pranav Jeevan, Amit Sethi
ViT · 48 · 13 · 0 · 05 Jul 2021

Long-Short Transformer: Efficient Transformers for Language and Vision
Chen Zhu, Ming-Yu Liu, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro
ViT · VLM · 119 · 133 · 0 · 05 Jul 2021

Closed-form Continuous-time Neural Models
Ramin Hasani, Mathias Lechner, Alexander Amini, Lucas Liebenwein, Aaron Ray, Max Tschaikowski, G. Teschl, Daniela Rus
PINN · AI4TS · 104 · 91 · 0 · 25 Jun 2021

Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu
95 · 50 · 0 · 23 Jun 2021

XCiT: Cross-Covariance Image Transformers
Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, ..., Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou
ViT · 151 · 517 · 0 · 17 Jun 2021

Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers
Mandela Patrick, Dylan Campbell, Yuki M. Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques
114 · 282 · 0 · 09 Jun 2021

Densely connected normalizing flows
Matej Grcić, Ivan Grubišić, Sinisa Segvic
TPM · 97 · 59 · 0 · 08 Jun 2021

A Survey of Transformers
Tianyang Lin, Yuxin Wang, Xiangyang Liu, Xipeng Qiu
ViT · 202 · 1,148 · 0 · 08 Jun 2021

Detect the Interactions that Matter in Matter: Geometric Attention for Many-Body Systems
Thorben Frank, Stefan Chmiela
51 · 3 · 0 · 04 Jun 2021

TransMIL: Transformer based Correlated Multiple Instance Learning for Whole Slide Image Classification
Zhucheng Shao, Hao Bian, Yang Chen, Yifeng Wang, Jian Zhang, Xiangyang Ji, Yongbing Zhang
ViT · MedIm · 138 · 689 · 0 · 02 Jun 2021

Choose a Transformer: Fourier or Galerkin
Shuhao Cao
90 · 256 · 0 · 31 May 2021

A Practical Survey on Faster and Lighter Transformers
Quentin Fournier, G. Caron, Daniel Aloise
137 · 104 · 0 · 26 Mar 2021

U-Net Transformer: Self and Cross Attention for Medical Image Segmentation
Olivier Petit, Nicolas Thome, Clément Rambour, L. Soler
ViT · MedIm · 99 · 250 · 0 · 10 Mar 2021

Beyond Nyströmformer -- Approximation of self-attention by Spectral Shifting
Madhusudan Verma
48 · 1 · 0 · 09 Mar 2021

Perceiver: General Perception with Iterative Attention
Andrew Jaegle, Felix Gimeno, Andrew Brock, Andrew Zisserman, Oriol Vinyals, João Carreira
VLM · ViT · MDE · 214 · 1,029 · 0 · 04 Mar 2021

LambdaNetworks: Modeling Long-Range Interactions Without Attention
Irwan Bello
359 · 181 · 0 · 17 Feb 2021

TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up
Yi Ding, Shiyu Chang, Zhangyang Wang
ViT · 154 · 393 · 0 · 14 Feb 2021

Transformers in Vision: A Survey
Salman Khan, Muzammal Naseer, Munawar Hayat, Syed Waqas Zamir, Fahad Shahbaz Khan, M. Shah
ViT · 387 · 2,567 · 0 · 04 Jan 2021

Point Transformer
Nico Engel, Vasileios Belagiannis, Klaus C. J. Dietmayer
3DPC · 190 · 2,023 · 0 · 02 Nov 2020

Efficient Transformers: A Survey
Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
VLM · 230 · 1,136 · 0 · 14 Sep 2020