FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention

T. Nguyen, Vai Suliafu, Stanley J. Osher, Long Chen, Bao Wang
arXiv:2108.02347 · 5 August 2021
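The title names the core idea: split the quadratic attention matrix into an exact, sparse "near-field" part (each token attends to its local neighborhood) and a cheap low-rank "far-field" part covering long-range interactions, in analogy with the fast multipole method. Below is a minimal NumPy sketch of that decomposition, not the authors' implementation: the banded near-field term, the elu(x)+1 feature map for the linearized far-field term, and the fixed mixing weights are all illustrative assumptions (the paper learns the mixing coefficients).

```python
# Minimal sketch of near-field / far-field decomposed attention.
# NOT the authors' code: feature map and fixed mixing weights are assumptions.
import numpy as np

def banded_softmax_attention(Q, K, V, band=8):
    """Near field: exact softmax attention restricted to a diagonal band."""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)
    offsets = np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])
    scores = np.where(offsets <= band, scores, -np.inf)  # mask outside band
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    return weights @ V / weights.sum(axis=1, keepdims=True)

def linear_attention(Q, K, V):
    """Far field: low-rank linearized attention via a positive feature map."""
    phi = lambda X: np.where(X > 0, X + 1.0, np.exp(X))  # elu(x) + 1 > 0
    Qf, Kf = phi(Q), phi(K)
    kv = Kf.T @ V                                        # (d, d_v) global summary
    normalizer = Qf @ Kf.sum(axis=0, keepdims=True).T    # (n, 1) row normalizer
    return (Qf @ kv) / normalizer                        # O(n) in sequence length

def fmm_style_attention(Q, K, V, band=8, w_near=0.5, w_far=0.5):
    # The paper combines the two terms with learnable coefficients;
    # fixed weights are used here for illustration.
    near = banded_softmax_attention(Q, K, V, band)
    far = linear_attention(Q, K, V)
    return w_near * near + w_far * far

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((16, 4)) for _ in range(3))
print(fmm_style_attention(Q, K, V).shape)  # (16, 4)
```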

Papers citing "FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention"

12 / 12 papers shown

Transformer Meets Twicing: Harnessing Unattended Residual Information
  Laziz U. Abdullaev, Tan M. Nguyen · 02 Mar 2025

SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
  Mohammad Mozaffari, Amir Yazdanbakhsh, Zhao Zhang, M. Dehnavi · 28 Jan 2025

Efficient Attention via Control Variates
  Lin Zheng, Jianbo Yuan, Chong-Jun Wang, Lingpeng Kong · 09 Feb 2023

Rock Guitar Tablature Generation via Natural Language Processing
  Josue Casco-Rodriguez · 12 Jan 2023

Efficient Long Sequence Modeling via State Space Augmented Transformer
  Simiao Zuo, Xiaodong Liu, Jian Jiao, Denis Xavier Charles, Eren Manavoglu, Tuo Zhao, Jianfeng Gao · 15 Dec 2022

Momentum Transformer: Closing the Performance Gap Between Self-attention and Its Linearization
  T. Nguyen, Richard G. Baraniuk, Robert M. Kirby, Stanley J. Osher, Bao Wang · 01 Aug 2022

Attention Mechanism in Neural Networks: Where it Comes and Where it Goes
  Derya Soydaner · 3DV · 27 Apr 2022

LambdaNetworks: Modeling Long-Range Interactions Without Attention
  Irwan Bello · 17 Feb 2021

Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting
  Haoyi Zhou, Shanghang Zhang, J. Peng, Shuai Zhang, Jianxin Li, Hui Xiong, Wan Zhang · AI4TS · 14 Dec 2020

Big Bird: Transformers for Longer Sequences
  Manzil Zaheer, Guru Guruganesh, Kumar Avinava Dubey, Joshua Ainslie, Chris Alberti, ..., Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed · VLM · 28 Jul 2020

Efficient Content-Based Sparse Attention with Routing Transformers
  Aurko Roy, M. Saffar, Ashish Vaswani, David Grangier · MoE · 12 Mar 2020

A Decomposable Attention Model for Natural Language Inference
  Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit · 06 Jun 2016