The Illusion of State in State-Space Models
William Merrill, Jackson Petty, Ashish Sabharwal
12 April 2024 · arXiv · PDF · HTML

Papers citing "The Illusion of State in State-Space Models" (32 papers shown)

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, Vahab Mirrokni. 17 Apr 2025.

TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper, Roland Fernandez, P. Smolensky, Jianfeng Gao. 29 Mar 2025.

Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations
Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto. 13 Mar 2025.

A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
William Merrill, Ashish Sabharwal. 05 Mar 2025.

(How) Do Language Models Track State?
Belinda Z. Li, Zifan Carl Guo, Jacob Andreas. 04 Mar 2025. [LRM]

Compositional Reasoning with Transformers, RNNs, and Chain of Thought
Gilad Yehudai, Noah Amsel, Joan Bruna. 03 Mar 2025. [LRM]

Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking
Yifan Zhang, Wenyu Du, Dongming Jin, Jie Fu, Zhi Jin. 27 Feb 2025. [LRM]

A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
Thomas Schmied, Thomas Adler, Vihang Patil, M. Beck, Korbinian Poppel, Johannes Brandstetter, G. Klambauer, Razvan Pascanu, Sepp Hochreiter. 21 Feb 2025.

Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn. 04 Feb 2025. [LRM]

ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
Lin Yueyu, Li Zhiyuan, Peter Yue, Liu Xiao. 28 Jan 2025.

MambaTron: Efficient Cross-Modal Point Cloud Enhancement using Aggregate Selective State Space Modeling
Sai Tarun Inaganti, Gennady Petrenko. 25 Jan 2025. [Mamba]

Towards Scalable and Stable Parallelization of Nonlinear RNNs
Xavier Gonzalez, Andrew Warrington, Jimmy T.H. Smith, Scott W. Linderman. 17 Jan 2025.

Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models
C. Conwell, Rupert Tawiah-Quashie, T. Ullman. 26 Nov 2024.

Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Riccardo Grazzi, Julien N. Siems, Jörg K.H. Franke, Arber Zela, Frank Hutter, Massimiliano Pontil. 19 Nov 2024.

Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences
Niklas Schmidinger, Lisa Schneckenreiter, Philipp Seidl, Johannes Schimunek, Pieter-Jan Hoedt, Johannes Brandstetter, Andreas Mayr, Sohvi Luukkonen, Sepp Hochreiter, G. Klambauer. 06 Nov 2024. [MedIm]

Taipan: Efficient and Expressive State Space Language Models with Selective Attention
Chien Van Nguyen, Huy Huu Nguyen, Thang M. Pham, Ruiyi Zhang, Hanieh Deilamsalehy, ..., Ryan A. Rossi, Trung Bui, Viet Dac Lai, Franck Dernoncourt, Thien Huu Nguyen. 24 Oct 2024. [Mamba, RALM]

Stick-breaking Attention
Shawn Tan, Yikang Shen, Songlin Yang, Aaron C. Courville, Rameswar Panda. 23 Oct 2024.

Can Mamba Always Enjoy the "Free Lunch"?
Ruifeng Ren, Zhicong Li, Yong Liu. 04 Oct 2024.

Autoregressive Large Language Models are Computationally Universal
Dale Schuurmans, Hanjun Dai, Francesco Zanini. 04 Oct 2024.

Demystifying the Token Dynamics of Deep Selective State Space Models
Thieu N. Vo, Tung D. Pham, Xin T. Tong, Tan Minh Nguyen. 04 Oct 2024. [Mamba]

How Well Can a Long Sequence Model Model Long Sequences? Comparing Architectural Inductive Biases on Long-Context Abilities
Jerry Huang. 11 Jul 2024.

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren, Yang Liu, Yadong Lu, Yelong Shen, Chen Liang, Weizhu Chen. 11 Jun 2024. [Mamba]

Evaluating the World Model Implicit in a Generative Model
Keyon Vafa, Justin Y. Chen, Jon M. Kleinberg, S. Mullainathan, Ashesh Rambachan. 06 Jun 2024.

Recurrent neural networks: vanishing and exploding gradients are not the end of the story
Nicolas Zucchet, Antonio Orvieto. 31 May 2024. [ODL, AAML]

Language Models Need Inductive Biases to Count Inductively
Yingshan Chang, Yonatan Bisk. 30 May 2024. [LRM]

The Expressive Capacity of State Space Models: A Formal Language Perspective
Yash Sarrof, Yana Veitsman, Michael Hahn. 27 May 2024. [Mamba]

Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
Jerome Sieber, Carmen Amo Alonso, A. Didier, M. Zeilinger, Antonio Orvieto. 24 May 2024. [AAML]

Theoretical Foundations of Deep Selective State-Space Models
Nicola Muca Cirone, Antonio Orvieto, Benjamin Walker, C. Salvi, Terry Lyons. 29 Feb 2024. [Mamba]

Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury, Cornelia Caragea. 01 Feb 2024.

Entity Tracking in Language Models
Najoung Kim, Sebastian Schuster. 03 May 2023.

A Logic for Expressing Log-Precision Transformers
William Merrill, Ashish Sabharwal. 06 Oct 2022. [ReLM, NAI, LRM]

Liquid Structural State-Space Models
Ramin Hasani, Mathias Lechner, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Daniela Rus. 26 Sep 2022. [AI4TS]