arXiv: 2011.04006
Long Range Arena: A Benchmark for Efficient Transformers

8 November 2020
Yi Tay
Mostafa Dehghani
Samira Abnar
Songlin Yang
Dara Bahri
Philip Pham
J. Rao
Liu Yang
Sebastian Ruder
Donald Metzler
arXiv (abs) · PDF · HTML · HuggingFace (1 upvote) · GitHub (757★)

Papers citing "Long Range Arena: A Benchmark for Efficient Transformers"

50 / 571 papers shown

DiffuMamba: High-Throughput Diffusion LMs with Mamba Backbone
Vaibhav Singh, Oleksiy Ostapenko, Pierre-Andre Noel, Torsten Scholak
19 Nov 2025

Semantic Multiplexing
Mohammad Abdi, Francesca Meneghello, Francesco Restuccia
16 Nov 2025

Belief Net: A Filter-Based Framework for Learning Hidden Markov Models from Observations
Reginald Zhiyan Chen, Heng-Sheng Chang, P. Mehta
13 Nov 2025

BudgetMem: Learning Selective Memory Policies for Cost-Efficient Long-Context Processing in Language Models
Chandra Vamsi Krishna Alla, Harish Naidu Gaddam, Manohar Kommi
07 Nov 2025

EchoLSTM: A Self-Reflective Recurrent Network for Stabilizing Long-Range Memory
Prasanth K K, Shubham Sharma
03 Nov 2025

Hankel Singular Value Regularization for Highly Compressible State Space Models
Paul Schwerdtner, Jules Berman, Benjamin Peherstorfer
27 Oct 2025

A Deep State-Space Model Compression Method using Upper Bound on Output Error
Hiroki Sakamoto, Kazuhiro Sato
16 Oct 2025

Long Exposure: Accelerating Parameter-Efficient Fine-Tuning for LLMs under Shadowy Sparsity
International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2024
Tuowei Wang, Kun Li, Zixu Hao, Donglin Bai, Ju Ren, Yaoxue Zhang, Ting Cao, M. Yang
12 Oct 2025

Task-Level Insights from Eigenvalues across Sequence Models
Rahel Rickenbach, Jelena Trisovic, A. Didier, Jerome Sieber, Melanie Zeilinger
10 Oct 2025

Design Principles for Sequence Models via Coefficient Dynamics
Jerome Sieber, Antonio Orvieto, Melanie Zeilinger, Carmen Amo Alonso
10 Oct 2025

Beyond independent component analysis: identifiability and algorithms
Alvaro Ribot, Anna Seigal, Piotr Zwiernik
08 Oct 2025

The End of Transformers? On Challenging Attention and the Rise of Sub-Quadratic Architectures
Alexander Fichtl, Jeremias Bohn, Josefin Kelber, Edoardo Mosca, Georg Groh
06 Oct 2025

RACE Attention: A Strictly Linear-Time Attention for Long-Sequence Training
Sahil Joshi, Agniva Chowdhury, Amar Kanakamedala, Ekam Singh, Evan Tu, Anshumali Shrivastava
05 Oct 2025

Wave-PDE Nets: Trainable Wave-Equation Layers as an Alternative to Attention
Harshil Vejendla
05 Oct 2025

The Curious Case of In-Training Compression of State Space Models
Makram Chahine, Philipp Nazari, Daniela Rus, T. Konstantin Rusch
03 Oct 2025

Memory Determines Learning Direction: A Theory of Gradient-Based Optimization in State Space Models
JingChuan Guan, T. Kubota, Yasuo Kuniyoshi, Kohei Nakajima
01 Oct 2025

Where to Add PDE Diffusion in Transformers
Yukun Zhang, Xueqing Zhou
27 Sep 2025

Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models
Aleksandar Terzić, Nicolas Menet, Michael Hersche, Thomas Hofmann, Abbas Rahimi
26 Sep 2025

Aligning Inductive Bias for Data-Efficient Generalization in State Space Models
Qiyu Chen, Guozhang Chen
25 Sep 2025

Myosotis: structured computation for attention like layer
Evgenii Egorov, Hanno Ackermann, Markus Nagel, H. Cai
24 Sep 2025

Mamba Modulation: On the Length Generalization of Mamba
Peng Lu, Jerry Huang, Qiuhao Zeng, X. Wang, Boxing Wang, Philippe Langlais, Yufei Cui
23 Sep 2025

An overview of neural architectures for self-supervised audio representation learning from masked spectrograms
Sarthak Yadav, Sergios Theodoridis, Zheng-Hua Tan
23 Sep 2025

CogniLoad: A Synthetic Natural Language Reasoning Benchmark With Tunable Length, Intrinsic Difficulty, and Distractor Density
Daniel Kaiser, Arnoldo Frigessi, Ali Ramezani-Kebrya, Benjamin Ricaud
22 Sep 2025

Dendritic Resonate-and-Fire Neuron for Effective and Efficient Long Sequence Modeling
Dehao Zhang, Malu Zhang, Shuai Wang, Jingya Wang, Wenjie Wei, Zeyu Ma, Guoqing Wang, Yang Yang, Haizhou Li
21 Sep 2025

Holographic Transformers for Complex-Valued Signal Processing: Integrating Phase Interference into Self-Attention
Enhao Huang, Zhiyu Zhang, Tianxiang Xu, Chunshu Xia, Kaichun Hu, Yuchen Yang, Tongtong Pan, Dong Dong, Zhan Qin
14 Sep 2025

The Illusion of Diminishing Returns: Measuring Long Horizon Execution in LLMs
Akshit Sinha, Arvindh Arun, Shashwat Goel, Steffen Staab, Jonas Geiping
11 Sep 2025

Gated Associative Memory: A Parallel O(N) Architecture for Efficient Sequence Modeling
Rishiraj Acharya
30 Aug 2025

Uncovering the Spectral Bias in Diagonal State Space Models
Rubén Solozabal, Velibor Bojkovic, Hilal AlQuabeh, Kentaro Inui, Martin Takáč
28 Aug 2025

Revisiting associative recall in modern recurrent models
Destiny Okpekpe, Antonio Orvieto
26 Aug 2025

Small transformer architectures for task switching
International Conference on Artificial Neural Networks (ICANN), 2025
Claudius Gros
06 Aug 2025

Systolic Array-based Accelerator for Structured State-Space Models
Shiva Raja, Cansu Demirkiran, Aakash Sarkar, Milos Popovic, A. Joshi
29 Jul 2025

Modality Agnostic Efficient Long Range Encoder
T. Parag, Ahmed Elgammal
25 Jul 2025

SCOPE: Stochastic and Counterbiased Option Placement for Evaluating Large Language Models
Wonjun Jeong, Dongseok Kim, Taegkeun Whangbo
24 Jul 2025

Compression Method for Deep Diagonal State Space Model Based on $H^2$ Optimal Reduction
IEEE Control Systems Letters (L-CSS), 2025
Hiroki Sakamoto, Kazuhiro Sato
14 Jul 2025

A Quantile Regression Approach for Remaining Useful Life Estimation with State Space Models
Davide Frizzo, Francesco Borsatti, Gian Antonio Susto
20 Jun 2025

From General to Targeted Rewards: Surpassing GPT-4 in Open-Ended Long-Context Generation
Zhihan Guo, Jiele Wu, Wenqian Cui, Yifei Zhang, Minda Hu, Yufei Wang, Irwin King
19 Jun 2025

SKOLR: Structured Koopman Operator Linear RNN for Time-Series Forecasting
Yitian Zhang, Liheng Ma, Antonios Valkanas, Boris N. Oreshkin, Mark Coates
17 Jun 2025

A Scalable Hybrid Training Approach for Recurrent Spiking Neural Networks
Maximilian Baronig, Yeganeh Bahariasl, Ozan Özdenizci, Robert Legenstein
17 Jun 2025

Scaling Algorithm Distillation for Continuous Control with Mamba
Samuel Beaussant, Mehdi Mounsif
16 Jun 2025

Revisiting Transformers with Insights from Image Filtering and Boosting
Laziz U. Abdullaev, Maksim Tkachenko, Tan M. Nguyen
12 Jun 2025

Uncovering the Computational Roles of Nonlinearity in Sequence Modeling Using Almost-Linear RNNs
Manuel Brenner, G. Koppe
09 Jun 2025

Improving the Efficiency of Long Document Classification using Sentence Ranking Approach
Prathamesh Kokate, Mitali Sarnaik, Manavi Khopade, Raviraj Joshi
08 Jun 2025

Visual Graph Arena: Evaluating Visual Conceptualization of Vision and Multimodal Large Language Models
Z. Babaiee, Peyman M. Kiasari, Daniela Rus, Radu Grosu
06 Jun 2025

Numerical Investigation of Sequence Modeling Theory using Controllable Memory Functions
Haotian Jiang, Zeyu Bao, Shida Wang, Qianxiao Li
06 Jun 2025

Context Is Not Comprehension
Alex Pan, Mary-Anne Williams
05 Jun 2025

SiLIF: Structured State Space Model Dynamics and Parametrization for Spiking Neural Networks
Maxime Fabre, Lyubov Dudchenko, Emre Neftci
04 Jun 2025

Mamba Drafters for Speculative Decoding
Daewon Choi, Seunghyuk Oh, Saket Dingliwal, Jihoon Tack, Kyuyoung Kim, ..., Insu Han, Jinwoo Shin, Aram Galstyan, Shubham Katiyar, S. Bodapati
01 Jun 2025

Weight-Space Linear Recurrent Neural Networks
Roussel Desmond Nzoyem, Nawid Keshtmand, Enrique Crespo Fernandez, Idriss Tsayem, Raúl Santos-Rodríguez, David A.W. Barton, Tom Deakin
01 Jun 2025

Adaptive Two Sided Laplace Transforms: A Learnable, Interpretable, and Scalable Replacement for Self-Attention
Andrew Kiruluta
01 Jun 2025

ContextQFormer: A New Context Modeling Method for Multi-Turn Multi-Modal Conversations
Yiming Lei, Zhizheng Yang, Zeming Liu, Haitao Leng, Shaoguo Liu, Tingting Gao, Qingjie Liu, Yunhong Wang
29 May 2025

Page 1 of 12