Parallelizing Linear Recurrent Neural Nets Over Sequence Length
Eric Martin, Chris Cundy
arXiv:1709.04057 · 12 September 2017

Papers citing "Parallelizing Linear Recurrent Neural Nets Over Sequence Length"

50 / 52 papers shown
Selective Rotary Position Embedding
Sajad Movahedi, Timur Carstensen, Arshia Afzal, Frank Hutter, Antonio Orvieto, Volkan Cevher
21 Nov 2025

Misaligned by Design: Incentive Failures in Machine Learning
David Autor, Andrew Caplin, Daniel Martin, Philip Marx
10 Nov 2025

MossNet: Mixture of State-Space Experts is a Multi-Head Attention
Shikhar Tuli, James Smith, Haris Jeelani, Chi-Heng Lin, Abhishek Patel, Vasili Ramanishka, Yen-Chang Hsu, Hongxia Jin
30 Oct 2025 · MoE

TempoPFN: Synthetic Pre-training of Linear RNNs for Zero-shot Time Series Forecasting
Vladyslav Moroshan, Julien N. Siems, Arber Zela, Timur Carstensen, Frank Hutter
29 Oct 2025 · AI4TS, AI4CE

SHAP Meets Tensor Networks: Provably Tractable Explanations with Parallelism
Reda Marzouk, Shahaf Bassan, Guy Katz
24 Oct 2025 · FAtt

Similarity-Aware Selective State-Space Modeling for Semantic Correspondence
Seungwook Kim, Minsu Cho
29 Sep 2025 · Mamba
Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models
Aleksandar Terzić, Nicolas Menet, Michael Hersche, Thomas Hofmann, Abbas Rahimi
26 Sep 2025

A Unifying Framework for Parallelizing Sequential Models with Linear Dynamical Systems
Xavier Gonzalez, E. Kelly Buchanan, Hyun Dong Lee, Jerry W. Liu, Ke Alexander Wang, D. Zoltowski, Christopher Ré, Scott W. Linderman
26 Sep 2025

Elucidating the Design Space of Decay in Linear Attention
Zhen Qin, Xuyang Shen, Yiran Zhong
05 Sep 2025

Revisiting associative recall in modern recurrent models
Destiny Okpekpe, Antonio Orvieto
26 Aug 2025

Predictability Enables Parallelization of Nonlinear State Space Models
Xavier Gonzalez, Leo Kozachkov, D. Zoltowski, Kenneth L. Clarkson, Scott W. Linderman
22 Aug 2025

Fast weight programming and linear transformers: from machine learning to neurobiology
Kazuki Irie, Samuel J. Gershman
11 Aug 2025
Prototype-Driven Structure Synergy Network for Remote Sensing Images Segmentation (IEEE TGRS, 2025)
Junyi Wang, Jinjiang Li, Guodong Fan, Yakun Ju, Xiang Fang, Alex C. Kot
06 Aug 2025

Knowing When to Quit: Probabilistic Early Exits for Speech Separation
Kenny Falkær Olsen, Mads Østergaard, Karl Ulbæk, S. F. V. Nielsen, Rasmus Malik Høegh Lindrup, Bjørn Sand Jensen, Morten Mørup
13 Jul 2025 · UQCV

RAT: Bridging RNN Efficiency and Attention Accuracy via Chunk-based Sequence Modeling (IEEE TPAMI, 2025)
Xiuying Wei, Anunay Yadav, Razvan Pascanu, Çağlar Gülçehre
06 Jul 2025 · AI4TS

Sequential-Parallel Duality in Prefix Scannable Models
Morris Yau, Sharut Gupta, Valerie Engelmayer, Kazuki Irie, Stefanie Jegelka, Jacob Andreas
12 Jun 2025

Uncovering the Computational Roles of Nonlinearity in Sequence Modeling Using Almost-Linear RNNs
Manuel Brenner, G. Koppe
09 Jun 2025

How Does Sequence Modeling Architecture Influence Base Capabilities of Pre-trained Language Models? Exploring Key Architecture Design Principles to Avoid Base Capabilities Degradation
Xin Lu, Yanyan Zhao, Si Wei, Shijin Wang, Bing Qin, Ting Liu
24 May 2025
Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
Benjamin Walker, Lingyi Yang, Nicola Muca Cirone, C. Salvi, Terry Lyons
23 May 2025 · AI4TS

Learning to Dissipate Energy in Oscillatory State-Space Models
Jared Boyer, T. Konstantin Rusch, Daniela Rus
17 May 2025

Hardware-aligned Hierarchical Sparse Attention for Efficient Long-term Memory Access
Xiang Hu, Jiaqi Leng, Jun Zhao, Kewei Tu, Wei Wu
23 Apr 2025 · Mamba

DiffVox: A Differentiable Model for Capturing and Analysing Vocal Effects Distributions
Chin-Yun Yu, Marco A. Martínez-Ramírez, Junghyun Koo, B. Hayes, Wei-Hsiang Liao, Gyorgy Fazekas, Yuki Mitsufuji
20 Apr 2025 · DiffM

Bidirectional Linear Recurrent Models for Sequence-Level Multisource Fusion
Qisai Liu, Zhanhong Jiang, Joshua R. Waite, Chao Liu, Aditya Balu, Soumik Sarkar
11 Apr 2025 · AI4TS
Resona: Improving Context Copying in Linear Recurrence Models with Retrieval
Xinyu Wang, Linrui Ma, Jerry Huang, Peng Lu, Prasanna Parthasarathi, Xiao-Wen Chang, Boxing Chen, Yufei Cui
28 Mar 2025 · KELM

Fixed-Point RNNs: Interpolating from Diagonal to Dense
Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto
13 Mar 2025

Towards Scalable and Stable Parallelization of Nonlinear RNNs (NeurIPS, 2024)
Xavier Gonzalez, Andrew Warrington, Jimmy T.H. Smith, Scott W. Linderman
17 Jan 2025

VMamba: Visual State Space Model (NeurIPS, 2024)
Yue Liu, Yunjie Tian, Yuzhong Zhao, Hongtian Yu, Lingxi Xie, Yaowei Wang, Qixiang Ye, Jianbin Jiao, Yunfan Liu
31 Dec 2024 · Mamba

Multi-Agent Reinforcement Learning with Selective State-Space Models (AAMAS, 2024)
Jemma Daniel, Ruan de Kock, Louay Ben Nessir, Sasha Abramowitz, Omayma Mahjoub, Wiem Khlifi, Claude Formanek, Arnu Pretorius
25 Oct 2024 · Mamba
Oscillatory State-Space Models (ICLR, 2024)
T. Konstantin Rusch, Daniela Rus
04 Oct 2024 · AI4TS

Real-Time Recurrent Learning using Trace Units in Reinforcement Learning (NeurIPS, 2024)
Esraa Elelimy, Adam White, Michael Bowling, Martha White
02 Sep 2024 · OffRL

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren, Yang Liu, Yadong Lu, Haoran Pan, Chen Liang, Weizhu Chen
11 Jun 2024 · Mamba

LongSSM: On the Length Extension of State-space Models in Language Modelling
Shida Wang
04 Jun 2024

Mamba-R: Vision Mamba Also Needs Registers (CVPR, 2024)
Feng Wang, Jiahao Wang, Sucheng Ren, Guoyizhe Wei, J. Mei, Wei Shao, Yuyin Zhou, Yaoyao Liu, Cihang Xie
23 May 2024 · Mamba

Does Transformer Interpretability Transfer to RNNs?
Gonçalo Paulo, Thomas Marshall, Nora Belrose
09 Apr 2024

Softmax Attention with Constant Cost per Token
Franz A. Heinsen
08 Apr 2024
Linear Attention Sequence Parallelism
Weigao Sun, Zhen Qin, Dong Li, Xuyang Shen, Yu Qiao, Yiran Zhong
03 Apr 2024

Theoretical Foundations of Deep Selective State-Space Models
Nicola Muca Cirone, Antonio Orvieto, Benjamin Walker, C. Salvi, Terry Lyons
29 Feb 2024 · Mamba

Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury, Cornelia Caragea
01 Feb 2024

Gated Linear Attention Transformers with Hardware-Efficient Training
Aaron Courville, Bailin Wang, Songlin Yang, Yikang Shen, Yoon Kim
11 Dec 2023

Hierarchically Gated Recurrent Neural Network for Sequence Modeling
Zhen Qin, Aaron Courville, Yiran Zhong
08 Nov 2023

RWKV: Reinventing RNNs for the Transformer Era (EMNLP, 2023)
Bo Peng, Eric Alcaide, Quentin G. Anthony, Alon Albalak, Samuel Arcadinho, ..., Qihang Zhao, P. Zhou, Qinghua Zhou, Jian Zhu, Rui-Jie Zhu
22 May 2023
Transformer Working Memory Enables Regular Language Reasoning and Natural Language Length Extrapolation (EMNLP, 2023)
Ta-Chung Chi, Ting-Han Fan, Alexander I. Rudnicky, Peter J. Ramadge
05 May 2023 · LRM

Parallel Spiking Neurons with High Efficiency and Ability to Learn Long-term Dependencies (NeurIPS, 2023)
Wei Fang, Zhaofei Yu, Zhaokun Zhou, Ding Chen, Yanqing Chen, Zhengyu Ma, T. Masquelier, Yonghong Tian
25 Apr 2023

Resurrecting Recurrent Neural Networks for Long Sequences (ICML, 2023)
Antonio Orvieto, Samuel L. Smith, Albert Gu, Anushan Fernando, Çağlar Gülçehre, Razvan Pascanu, Soham De
11 Mar 2023

Parallelizing Legendre Memory Unit Training (ICML, 2021)
Narsimha Chilkuri, C. Eliasmith
22 Feb 2021

Sub-Linear Memory: How to Make Performers SLiM (NeurIPS, 2020)
Valerii Likhosherstov, K. Choromanski, Jared Davis, Xingyou Song, Adrian Weller
21 Dec 2020

Learning to Reconstruct and Segment 3D Objects
Bo Yang
19 Oct 2020 · 3DPC
Learning Efficient Representations of Mouse Movements to Predict User Attention (SIGIR, 2020)
Ioannis Arapakis, Luis A. Leiva
30 May 2020 · HAI

Tensor Networks for Probabilistic Sequence Modeling
Jacob Miller, Guillaume Rabusseau, John Terilla
02 Mar 2020

Equilibrated Recurrent Neural Network: Neuronal Time-Delayed Self-Feedback Improves Accuracy and Stability
Ziming Zhang, Anil Kag, Alan Sullivan, Venkatesh Saligrama
02 Mar 2019