2402.08164
On Limitations of the Transformer Architecture
13 February 2024
Binghui Peng
Srini Narayanan
Christos H. Papadimitriou
Papers citing "On Limitations of the Transformer Architecture"
28 / 28 papers shown
Lost in Transmission: When and Why LLMs Fail to Reason Globally
Tobias Schnabel
Kiran Tomlinson
Adith Swaminathan
Jennifer Neville
LRM
20
0
0
13 May 2025
Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents
Schaun Wheeler
Olivier Jeunen
LLMAG
38
0
0
06 May 2025
Concise One-Layer Transformers Can Do Function Evaluation (Sometimes)
Lena Strobl
Dana Angluin
Robert Frank
38
0
0
28 Mar 2025
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?
Aabid Karim
Abdul Karim
Bhoomika Lohana
Matt Keon
Jaswinder Singh
A. Sattar
47
0
0
23 Mar 2025
The Role of Sparsity for Length Generalization in Transformers
Noah Golowich
Samy Jelassi
David Brandfonbrener
Sham Kakade
Eran Malach
37
0
0
24 Feb 2025
Provably Overwhelming Transformer Models with Designed Inputs
Lev Stambler
Seyed Sajjad Nezhadi
Matthew Coudron
74
0
0
09 Feb 2025
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri
Xinting Huang
Mark Rofin
Michael Hahn
LRM
119
0
0
04 Feb 2025
Strassen Attention: Unlocking Compositional Abilities in Transformers Based on a New Lower Bound Method
A. Kozachinskiy
Felipe Urrutia
Hector Jimenez
Tomasz Steifer
Germán Pizarro
Matías Fuentes
Francisco Meza
Cristian Buc
Cristóbal Rojas
47
1
0
31 Jan 2025
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
Yutong Yin
Zhaoran Wang
LRM
ReLM
74
0
0
27 Jan 2025
Ehrenfeucht-Haussler Rank and Chain of Thought
Pablo Barceló
A. Kozachinskiy
Tomasz Steifer
LRM
71
1
0
22 Jan 2025
A completely uniform transformer for parity
A. Kozachinskiy
Tomasz Steifer
33
0
0
07 Jan 2025
Lower bounds on transformers with infinite precision
Alexander Kozachinskiy
29
2
0
31 Dec 2024
Theoretical limitations of multi-layer Transformer
Lijie Chen
Binghui Peng
Hongxun Wu
AI4CE
67
6
0
04 Dec 2024
The Asymptotic Behavior of Attention in Transformers
Álvaro Rodríguez Abella
João Pedro Silvestre
Paulo Tabuada
61
3
0
03 Dec 2024
Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?
Sohee Yang
Nora Kassner
E. Gribovskaya
Sebastian Riedel
Mor Geva
KELM
LRM
ReLM
78
4
0
25 Nov 2024
DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
Shreya Shankar
Tristan Chambers
Tarak Shah
Aditya G. Parameswaran
Eugene Wu
LLMAG
53
6
0
16 Oct 2024
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
Iman Mirzadeh
Keivan Alizadeh
Hooman Shahrokhi
Oncel Tuzel
Samy Bengio
Mehrdad Farajtabar
AIMat
LRM
58
127
0
07 Oct 2024
Evaluating and explaining training strategies for zero-shot cross-lingual news sentiment analysis
Luka Andrenšek
Boshko Koloski
Andraz Pelicon
Nada Lavrac
Senja Pollak
Matthew Purver
21
1
0
30 Sep 2024
Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective
Yotam Wolf
Binyamin Rothberg
Dorin Shteyman
Amnon Shashua
18
0
0
26 Sep 2024
One-layer transformers fail to solve the induction heads task
Clayton Sanford
Daniel J. Hsu
Matus Telgarsky
18
8
0
26 Aug 2024
Can Large Language Models Reason? A Characterization via 3-SAT
Rishi Hazra
Gabriele Venturato
Pedro Zuidberg Dos Martires
Luc de Raedt
ELM
ReLM
LRM
30
4
0
13 Aug 2024
When Can Transformers Count to n?
Gilad Yehudai
Haim Kaplan
Asma Ghandeharioun
Mor Geva
Amir Globerson
32
10
0
21 Jul 2024
On the Design and Analysis of LLM-Based Algorithms
Yanxi Chen
Yaliang Li
Bolin Ding
Jingren Zhou
41
4
0
20 Jul 2024
Cognitive Map for Language Models: Optimal Planning via Verbally Representing the World Model
Doyoung Kim
Jongwon Lee
Jinho Park
Minjoon Seo
LM&Ro
36
0
0
21 Jun 2024
The Expressive Capacity of State Space Models: A Formal Language Perspective
Yash Sarrof
Yana Veitsman
Michael Hahn
Mamba
30
7
0
27 May 2024
Large Language Models for UAVs: Current State and Pathways to the Future
Shumaila Javaid
Nasir Saeed
Bin He
32
16
0
02 May 2024
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Badri N. Patro
Vijay Srinivas Agneeswaran
Mamba
30
38
0
24 Apr 2024
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,402
0
28 Jan 2022