FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention
arXiv 2108.02347 · 5 August 2021
T. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang
Papers citing "FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention" (12 of 12 shown)

1. Transformer Meets Twicing: Harnessing Unattended Residual Information
   Laziz U. Abdullaev, Tan M. Nguyen (02 Mar 2025; 2 citations)

2. SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
   Mohammad Mozaffari, Amir Yazdanbakhsh, Zhao Zhang, M. Dehnavi (28 Jan 2025; 5 citations)

3. Efficient Attention via Control Variates
   Lin Zheng, Jianbo Yuan, Chong-Jun Wang, Lingpeng Kong (09 Feb 2023; 18 citations)

4. Rock Guitar Tablature Generation via Natural Language Processing
   Josue Casco-Rodriguez (12 Jan 2023; 1 citation)

5. Efficient Long Sequence Modeling via State Space Augmented Transformer
   Simiao Zuo, Xiaodong Liu, Jian Jiao, Denis Xavier Charles, Eren Manavoglu, Tuo Zhao, Jianfeng Gao (15 Dec 2022; 36 citations)

6. Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
   T. Nguyen, Richard G. Baraniuk, Robert M. Kirby, Stanley J. Osher, Bao Wang (01 Aug 2022; 9 citations)

7. Attention Mechanism in Neural Networks: Where it Comes and Where it Goes [3DV]
   Derya Soydaner (27 Apr 2022; 149 citations)

8. LambdaNetworks: Modeling Long-Range Interactions Without Attention
   Irwan Bello (17 Feb 2021; 178 citations)

9. Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting [AI4TS]
   Haoyi Zhou, Shanghang Zhang, J. Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wan Zhang (14 Dec 2020; 3,855 citations)

10. Big Bird: Transformers for Longer Sequences [VLM]
    Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed (28 Jul 2020; 2,009 citations)

11. Efficient Content-Based Sparse Attention with Routing Transformers [MoE]
    Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier (12 Mar 2020; 578 citations)

12. A Decomposable Attention Model for Natural Language Inference
    Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit (06 Jun 2016; 1,363 citations)