Separations in the Representational Capabilities of Transformers and Recurrent Architectures
S. Bhattamishra, Michael Hahn, Phil Blunsom, Varun Kanade
13 June 2024 · arXiv:2406.09347 · GNN

Papers citing "Separations in the Representational Capabilities of Transformers and Recurrent Architectures"

19 papers shown

Hybrid Quantum-Classical Recurrent Neural Networks
Wenduan Xu
29 Oct 2025

Benefits and Limitations of Communication in Multi-Agent Reasoning
Michael Rizvi-Martel, S. Bhattamishra, Neil Rathi, Guillaume Rabusseau, Michael Hahn · LRM
14 Oct 2025

The Transformer Cookbook
Andy Yang, Christopher Watson, Anton Xue, S. Bhattamishra, Jose Llarena, William Merrill, Emile Dos Santos Ferreira, Anej Svete, David Chiang
01 Oct 2025

Fast attention mechanisms: a tale of parallelism
Jingwen Liu, Hantao Yu, Clayton Sanford, Alexandr Andoni, Daniel J. Hsu
10 Sep 2025

Two Heads Are Better than One: Simulating Large Transformers with Small Ones
Hantao Yu, Josh Alman
13 Jun 2025

Born a Transformer -- Always a Transformer? On the Effect of Pretraining on Architectural Abilities
Yana Veitsman, Mayank Jobanputra, Yash Sarrof, Aleksandra Bakalova, Vera Demberg, Ellie Pavlick, Michael Hahn
27 May 2025

Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers
Charles London, Varun Kanade
27 May 2025

PaTH Attention: Position Encoding via Accumulating Householder Transformations
Songlin Yang, Yikang Shen, Kaiyue Wen, Shawn Tan, Mayank Mishra, Liliang Ren, Rameswar Panda, Yoon Kim
22 May 2025

Lost in Transmission: When and Why LLMs Fail to Reason Globally
Tobias Schnabel, Kiran Tomlinson, Adith Swaminathan, Jennifer Neville · LRM
13 May 2025

When Do Transformers Outperform Feedforward and Recurrent Networks? A Statistical Perspective
Alireza Mousavi-Hosseini, Clayton Sanford, Denny Wu, Murat A. Erdogdu
14 Mar 2025

Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Arvid Frydenlund · LRM
13 Mar 2025

Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn · LRM
04 Feb 2025

Strassen Attention, Split VC Dimension and Compositionality in Transformers
Chris Köcher, Felipe Urrutia, Hector Jimenez, Tomasz Steifer, Germán Pizarro, Matías Fuentes, Francisco Meza, Cristian Buc, Cristóbal Rojas
31 Jan 2025

A completely uniform transformer for parity
Chris Köcher, Tomasz Steifer
07 Jan 2025

Lower bounds on transformers with infinite precision
Alexander Kozachinskiy
31 Dec 2024

Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues
International Conference on Learning Representations (ICLR), 2024
Riccardo Grazzi, Julien N. Siems, Jörg Franke, Arber Zela, Katharina Eggensperger, Massimiliano Pontil
19 Nov 2024

Training Neural Networks as Recognizers of Formal Languages
International Conference on Learning Representations (ICLR), 2024
Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Robert Bamler, Brian DuSell · NAI
11 Nov 2024

The Expressive Capacity of State Space Models: A Formal Language Perspective
Yash Sarrof, Yana Veitsman, Michael Hahn · Mamba
27 May 2024

Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks
Jerome Sieber, Carmen Amo Alonso, A. Didier, Melanie Zeilinger, Antonio Orvieto · AAML
24 May 2024