Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention

7 February 2021
Yunyang Xiong
Zhanpeng Zeng
Rudrasis Chakraborty
Mingxing Tan
G. Fung
Yin Li
Vikas Singh
ArXiv (abs) · PDF · HTML · GitHub (376★)
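
For context on the approximation the papers below build on, here is a minimal NumPy sketch of the Nyström idea: the full n×n softmax attention matrix is never formed; instead, m landmark rows (segment means of Q and K) yield three small kernels whose product approximates it. The function name and shapes are illustrative assumptions, np.linalg.pinv stands in for the paper's iterative pseudoinverse approximation, and the paper's extra depthwise-convolution skip connection on V is omitted — a sketch of the technique, not the reference implementation.

    import numpy as np

    def softmax(x, axis=-1):
        # Numerically stable softmax along the given axis.
        x = x - x.max(axis=axis, keepdims=True)
        e = np.exp(x)
        return e / e.sum(axis=axis, keepdims=True)

    def nystrom_attention(Q, K, V, num_landmarks=8):
        # Q, K, V: (n, d) arrays; n must be divisible by num_landmarks here.
        n, d = Q.shape
        m = num_landmarks
        scale = 1.0 / np.sqrt(d)
        # Landmarks: mean-pool Q and K over n/m contiguous segments.
        Q_l = Q.reshape(m, n // m, d).mean(axis=1)   # (m, d)
        K_l = K.reshape(m, n // m, d).mean(axis=1)   # (m, d)
        # Three small softmax kernels stand in for the full n x n matrix.
        F1 = softmax(Q @ K_l.T * scale)              # (n, m)
        A  = softmax(Q_l @ K_l.T * scale)            # (m, m)
        F2 = softmax(Q_l @ K.T * scale)              # (m, n)
        # softmax(QK^T/sqrt(d)) V  ~=  F1 pinv(A) (F2 V): O(n*m) memory, not O(n^2).
        return F1 @ np.linalg.pinv(A) @ (F2 @ V)     # (n, d)

    # Example: 512-token sequence, 64-dim head, 8 landmarks.
    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((512, 64)) for _ in range(3))
    out = nystrom_attention(Q, K, V, num_landmarks=8)  # shape (512, 64)

Since m is fixed and small, cost grows linearly in sequence length n, which is why so many of the long-sequence papers below cite this method as a baseline.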

Papers citing "Nyströmformer: A Nyström-Based Algorithm for Approximating Self-Attention"

50 / 146 papers shown
Diagnose Like a Pathologist: Transformer-Enabled Hierarchical Attention-Guided Multiple Instance Learning for Whole Slide Image Classification
Conghao Xiong
Hao Chen
Joseph J. Y. Sung
Irwin King
MedIm
88
38
0
19 Jan 2023
Local Learning on Transformers via Feature Reconstruction
P. Pathak
Jingwei Zhang
Dimitris Samaras
ViT
125
5
0
29 Dec 2022
Multi-Scale Relational Graph Convolutional Network for Multiple Instance Learning in Histopathology Images
Roozbeh Bazargani
L. Fazli
L. Goldenberg
M. Gleave
A. Bashashati
Septimiu Salcudean
MedIm
102
7
0
17 Dec 2022
Efficient Long Sequence Modeling via State Space Augmented Transformer
Simiao Zuo
Xiaodong Liu
Jian Jiao
Denis Xavier Charles
Eren Manavoglu
Tuo Zhao
Jianfeng Gao
175
37
0
15 Dec 2022
Orthogonal SVD Covariance Conditioning and Latent Disentanglement
Yue Song
N. Sebe
Wei Wang
82
6
0
11 Dec 2022
LawngNLI: A Long-Premise Benchmark for In-Domain Generalization from Short to Long Contexts and for Implication-Based Retrieval
William F. Bruno
Dan Roth
ELM, AILaw
51
7
0
06 Dec 2022
Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference
Haoran You
Yunyang Xiong
Xiaoliang Dai
Bichen Wu
Peizhao Zhang
Haoqi Fan
Peter Vajda
Yingyan Lin
155
34
0
18 Nov 2022
XNOR-FORMER: Learning Accurate Approximations in Long Speech Transformers
Roshan S. Sharma
Bhiksha Raj
48
3
0
29 Oct 2022
Dense but Efficient VideoQA for Intricate Compositional Reasoning
Jihyeon Janel Lee
Wooyoung Kang
Eun-Sol Kim
CoGe
51
4
0
19 Oct 2022
CAB: Comprehensive Attention Benchmarking on Long Sequence Modeling
Jinchao Zhang
Shuyang Jiang
Jiangtao Feng
Lin Zheng
Dianbo Sui
3DV
197
9
0
14 Oct 2022
Liquid Structural State-Space Models
Ramin Hasani
Mathias Lechner
Tsun-Hsuan Wang
Makram Chahine
Alexander Amini
Daniela Rus
AI4TS
155
107
0
26 Sep 2022
Fast-FNet: Accelerating Transformer Encoder Models via Efficient Fourier Layers
Nurullah Sevim
Ege Ozan Özyedek
Furkan Şahinuç
Aykut Koç
85
12
0
26 Sep 2022
Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design
Hongxiang Fan
Thomas C. P. Chau
Stylianos I. Venieris
Royson Lee
Alexandros Kouris
Wayne Luk
Nicholas D. Lane
Mohamed S. Abdelfattah
80
62
0
20 Sep 2022
Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans
John J. Nay
ELM, AILaw
190
29
0
14 Sep 2022
On The Computational Complexity of Self-Attention
Feyza Duman Keles
Pruthuvi Maheshakya Wijewardena
Chinmay Hegde
142
130
0
11 Sep 2022
A Circular Window-based Cascade Transformer for Online Action Detection
Shuyuan Cao
Weihua Luo
Bairui Wang
Wei Emma Zhang
Lin Ma
87
6
0
30 Aug 2022
Deep is a Luxury We Don't Have
Ahmed Taha
Yen Nhi Truong Vu
Brent Mombourquette
Thomas P. Matthews
Jason Su
Sadanand Singh
ViT, MedIm
56
2
0
11 Aug 2022
Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
T. Nguyen
Richard G. Baraniuk
Robert M. Kirby
Stanley J. Osher
Bao Wang
127
9
0
01 Aug 2022
Cost Aggregation with 4D Convolutional Swin Transformer for Few-Shot Segmentation
Sunghwan Hong
Seokju Cho
Jisu Nam
Stephen Lin
Seung Wook Kim
ViT
109
133
0
22 Jul 2022
Conditional DETR V2: Efficient Detection Transformer with Box Queries
Xiaokang Chen
Fangyun Wei
Gang Zeng
Jingdong Wang
ViT
75
33
0
18 Jul 2022
Consistency of Implicit and Explicit Features Matters for Monocular 3D Object Detection
Qian Ye
L. Jiang
Wang Zhen
Yuyang Du
49
6
0
16 Jul 2022
XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
Ho Kei Cheng
Alex Schwing
VLM, VOS
113
411
0
14 Jul 2022
QSAN: A Near-term Achievable Quantum Self-Attention Network
Jinjing Shi
Ren-Xin Zhao
Wenxuan Wang
Shenmin Zhang
Xuelong Li
107
21
0
14 Jul 2022
Pure Transformers are Powerful Graph Learners
Jinwoo Kim
Tien Dat Nguyen
Seonwoo Min
Sungjun Cho
Moontae Lee
Honglak Lee
Seunghoon Hong
99
201
0
06 Jul 2022
Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality
Yue Song
N. Sebe
Wei Wang
88
8
0
05 Jul 2022
Softmax-free Linear Transformers
Jiachen Lu
Junge Zhang
Xiatian Zhu
Jianfeng Feng
Tao Xiang
Li Zhang
ViT
54
8
0
05 Jul 2022
SimA: Simple Softmax-free Attention for Vision Transformers
Soroush Abbasi Koohpayegani
Hamed Pirsiavash
92
26
0
17 Jun 2022
FETILDA: An Effective Framework For Fin-tuned Embeddings For Long Financial Text Documents
Bolun "Namir" Xia
Vipula Rawte
Mohammed J Zaki
Aparna Gupta
AI4TS
18
1
0
14 Jun 2022
BayesFormer: Transformer with Uncertainty Estimation
Karthik Abinav Sankararaman
Sinong Wang
Han Fang
UQCV, BDL
60
11
0
02 Jun 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
418
2,296
0
27 May 2022
X-ViT: High Performance Linear Vision Transformer without Softmax
Jeonggeun Song
Heung-Chang Lee
ViT
47
2
0
27 May 2022
Dynamic Query Selection for Fast Visual Perceiver
Corentin Dancette
Matthieu Cord
75
1
0
22 May 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
98
73
0
20 May 2022
A graph-transformer for whole slide image classification
Yi Zheng
R. Gindra
Emily J. Green
E. Burks
Margrit Betke
J. Beane
V. Kolachalama
MedIm
110
131
0
19 May 2022
FiLM: Frequency improved Legendre Memory Model for Long-term Time Series Forecasting
Tian Zhou
Ziqing Ma
Xue Wang
Qingsong Wen
Liang Sun
Tao Yao
Wotao Yin
Rong Jin
AI4TS
197
189
0
18 May 2022
Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers
Arda Sahiner
Tolga Ergen
Batu Mehmet Ozturkler
John M. Pauly
Morteza Mardani
Mert Pilanci
132
33
0
17 May 2022
EdgeViTs: Competing Light-weight CNNs on Mobile Devices with Vision Transformers
Junting Pan
Adrian Bulat
Fuwen Tan
Xiatian Zhu
Łukasz Dudziak
Hongsheng Li
Georgios Tzimiropoulos
Brais Martínez
ViT
94
197
0
06 May 2022
Investigating Neural Architectures by Synthetic Dataset Design
Adrien Courtois
Jean-Michel Morel
Pablo Arias
72
4
0
23 Apr 2022
Paramixer: Parameterizing Mixing Links in Sparse Factors Works Better than Dot-Product Self-Attention
Tong Yu
Ruslan Khalitov
Lei Cheng
Zhirong Yang
MoE
71
11
0
22 Apr 2022
FAR: Fourier Aerial Video Recognition
D. Kothandaraman
Tianrui Guan
Xijun Wang
Sean Hu
Ming-Shun Lin
Tianyi Zhou
73
13
0
21 Mar 2022
Dynamic N:M Fine-grained Structured Sparse Attention Mechanism
Zhaodong Chen
Yuying Quan
Zheng Qu
Liu Liu
Yufei Ding
Yuan Xie
85
23
0
28 Feb 2022
cosFormer: Rethinking Softmax in Attention
Zhen Qin
Weixuan Sun
Huicai Deng
Dongxu Li
Yunshen Wei
Baohong Lv
Junjie Yan
Lingpeng Kong
Yiran Zhong
97
222
0
17 Feb 2022
ActionFormer: Localizing Moments of Actions with Transformers
Chen-Da Liu-Zhang
Jianxin Wu
Yin Li
ViT
116
342
0
16 Feb 2022
CATs++: Boosting Cost Aggregation with Convolutions and Transformers
Seokju Cho
Sunghwan Hong
Seung Wook Kim
ViT
106
40
0
14 Feb 2022
Flowformer: Linearizing Transformers with Conservation Flows
Haixu Wu
Jialong Wu
Jiehui Xu
Jianmin Wang
Mingsheng Long
64
92
0
13 Feb 2022
Exploiting Spatial Sparsity for Event Cameras with Visual Transformers
Zuowen Wang
Yuhuang Hu
Shih-Chii Liu
ViT
94
33
0
10 Feb 2022
FEDformer: Frequency Enhanced Decomposed Transformer for Long-term Series Forecasting
Tian Zhou
Ziqing Ma
Qingsong Wen
Xue Wang
Liang Sun
Rong Jin
AI4TS
272
1,480
0
30 Jan 2022
CsFEVER and CTKFacts: Acquiring Czech data for fact verification
Herbert Ullrich
Jan Drchal
Martin Rýpar
Hana Vincourová
Václav Moravec
HILM
79
9
0
26 Jan 2022
Convolutional Xformers for Vision
Pranav Jeevan
Amit Sethi
ViT
91
12
0
25 Jan 2022
Continual Transformers: Redundancy-Free Attention for Online Inference
Lukas Hedegaard
Arian Bakhtiarnia
Alexandros Iosifidis
CLL
89
12
0
17 Jan 2022