The Illusion of State in State-Space Models
William Merrill, Jackson Petty, Ashish Sabharwal
12 April 2024 · arXiv · PDF · HTML

Papers citing "The Illusion of State in State-Space Models" (32 papers shown)

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Ali Behrouz, Meisam Razaviyayn, Peilin Zhong, Vahab Mirrokni. 17 Apr 2025.

TRA: Better Length Generalisation with Threshold Relative Attention
Mattia Opper, Roland Fernandez, P. Smolensky, Jianfeng Gao. 29 Mar 2025.

Fixed-Point RNNs: From Diagonal to Dense in a Few Iterations
Sajad Movahedi, Felix Sarnthein, Nicola Muca Cirone, Antonio Orvieto. 13 Mar 2025.

A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers
William Merrill, Ashish Sabharwal. 05 Mar 2025.

(How) Do Language Models Track State?
Belinda Z. Li, Zifan Carl Guo, Jacob Andreas. 04 Mar 2025. [LRM]

Compositional Reasoning with Transformers, RNNs, and Chain of Thought
Gilad Yehudai, Noah Amsel, Joan Bruna. 03 Mar 2025. [LRM]

Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking
Yifan Zhang, Wenyu Du, Dongming Jin, Jie Fu, Zhi Jin. 27 Feb 2025. [LRM]

A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks
Thomas Schmied, Thomas Adler, Vihang Patil, M. Beck, Korbinian Poppel, Johannes Brandstetter, G. Klambauer, Razvan Pascanu, Sepp Hochreiter. 21 Feb 2025.

Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn. 04 Feb 2025. [LRM]

ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
Lin Yueyu, Li Zhiyuan, Peter Yue, Liu Xiao. 28 Jan 2025.

MambaTron: Efficient Cross-Modal Point Cloud Enhancement using Aggregate Selective State Space Modeling
Sai Tarun Inaganti, Gennady Petrenko. 25 Jan 2025. [Mamba]

Towards Scalable and Stable Parallelization of Nonlinear RNNs
Xavier Gonzalez, Andrew Warrington, Jimmy T.H. Smith, Scott W. Linderman. 17 Jan 2025.

Relations, Negations, and Numbers: Looking for Logic in Generative Text-to-Image Models
C. Conwell, Rupert Tawiah-Quashie, T. Ullman. 26 Nov 2024.

Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
Riccardo Grazzi, Julien N. Siems, Jörg K.H. Franke, Arber Zela, Frank Hutter, Massimiliano Pontil. 19 Nov 2024.

Bio-xLSTM: Generative modeling, representation and in-context learning of biological and chemical sequences
Niklas Schmidinger, Lisa Schneckenreiter, Philipp Seidl, Johannes Schimunek, Pieter-Jan Hoedt, Johannes Brandstetter, Andreas Mayr, Sohvi Luukkonen, Sepp Hochreiter, G. Klambauer. 06 Nov 2024. [MedIm]

Taipan: Efficient and Expressive State Space Language Models with Selective Attention
Chien Van Nguyen, Huy Huu Nguyen, Thang M. Pham, Ruiyi Zhang, Hanieh Deilamsalehy, ..., Ryan A. Rossi, Trung Bui, Viet Dac Lai, Franck Dernoncourt, Thien Huu Nguyen. 24 Oct 2024. [Mamba, RALM]

Stick-breaking Attention
Shawn Tan, Yikang Shen, Songlin Yang, Aaron C. Courville, Rameswar Panda. 23 Oct 2024.

Can Mamba Always Enjoy the "Free Lunch"?
Ruifeng Ren, Zhicong Li, Yong Liu. 04 Oct 2024.

Autoregressive Large Language Models are Computationally Universal
Dale Schuurmans, Hanjun Dai, Francesco Zanini. 04 Oct 2024.

Demystifying the Token Dynamics of Deep Selective State Space Models
Thieu N. Vo, Tung D. Pham, Xin T. Tong, Tan Minh Nguyen. 04 Oct 2024. [Mamba]

How Well Can a Long Sequence Model Model Long Sequences? Comparing Architectural Inductive Biases on Long-Context Abilities
Jerry Huang. 11 Jul 2024.

Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren, Yang Liu, Yadong Lu, Yelong Shen, Chen Liang, Weizhu Chen. 11 Jun 2024. [Mamba]

Evaluating the World Model Implicit in a Generative Model
Keyon Vafa, Justin Y. Chen, Jon M. Kleinberg, S. Mullainathan, Ashesh Rambachan. 06 Jun 2024.

Recurrent neural networks: vanishing and exploding gradients are not the end of the story
Nicolas Zucchet, Antonio Orvieto. 31 May 2024. [ODL, AAML]

Language Models Need Inductive Biases to Count Inductively
Yingshan Chang, Yonatan Bisk. 30 May 2024. [LRM]

The Expressive Capacity of State Space Models: A Formal Language Perspective
Yash Sarrof, Yana Veitsman, Michael Hahn. 27 May 2024. [Mamba]

Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
Jerome Sieber, Carmen Amo Alonso, A. Didier, M. Zeilinger, Antonio Orvieto. 24 May 2024. [AAML]

Theoretical Foundations of Deep Selective State-Space Models
Nicola Muca Cirone, Antonio Orvieto, Benjamin Walker, C. Salvi, Terry Lyons. 29 Feb 2024. [Mamba]

Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury, Cornelia Caragea. 01 Feb 2024.

Entity Tracking in Language Models
Najoung Kim, Sebastian Schuster. 03 May 2023.

A Logic for Expressing Log-Precision Transformers
William Merrill, Ashish Sabharwal. 06 Oct 2022. [ReLM, NAI, LRM]

Liquid Structural State-Space Models
Ramin Hasani, Mathias Lechner, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Daniela Rus. 26 Sep 2022. [AI4TS]