On Limitations of the Transformer Architecture (arXiv:2402.08164)

13 February 2024
Binghui Peng, Srini Narayanan, Christos H. Papadimitriou

Papers citing "On Limitations of the Transformer Architecture" (28 papers shown)

  1. Lost in Transmission: When and Why LLMs Fail to Reason Globally
     Tobias Schnabel, Kiran Tomlinson, Adith Swaminathan, Jennifer Neville [LRM] (13 May 2025)
  2. Procedural Memory Is Not All You Need: Bridging Cognitive Gaps in LLM-Based Agents
     Schaun Wheeler, Olivier Jeunen [LLMAG] (06 May 2025)
  3. Concise One-Layer Transformers Can Do Function Evaluation (Sometimes)
     Lena Strobl, Dana Angluin, Robert Frank (28 Mar 2025)
  4. Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?
     Aabid Karim, Abdul Karim, Bhoomika Lohana, Matt Keon, Jaswinder Singh, A. Sattar (23 Mar 2025)
  5. The Role of Sparsity for Length Generalization in Transformers
     Noah Golowich, Samy Jelassi, David Brandfonbrener, Sham Kakade, Eran Malach (24 Feb 2025)
  6. Provably Overwhelming Transformer Models with Designed Inputs
     Lev Stambler, Seyed Sajjad Nezhadi, Matthew Coudron (09 Feb 2025)
  7. Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
     Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn [LRM] (04 Feb 2025)
  8. Strassen Attention: Unlocking Compositional Abilities in Transformers Based on a New Lower Bound Method
     A. Kozachinskiy, Felipe Urrutia, Hector Jimenez, Tomasz Steifer, Germán Pizarro, Matías Fuentes, Francisco Meza, Cristian Buc, Cristóbal Rojas (31 Jan 2025)
  9. Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
     Yutong Yin, Zhaoran Wang [LRM, ReLM] (27 Jan 2025)
 10. Ehrenfeucht-Haussler Rank and Chain of Thought
     Pablo Barceló, A. Kozachinskiy, Tomasz Steifer [LRM] (22 Jan 2025)
 11. A completely uniform transformer for parity
     A. Kozachinskiy, Tomasz Steifer (07 Jan 2025)
 12. Lower bounds on transformers with infinite precision
     Alexander Kozachinskiy (31 Dec 2024)
 13. Theoretical limitations of multi-layer Transformer
     Lijie Chen, Binghui Peng, Hongxun Wu [AI4CE] (04 Dec 2024)
 14. The Asymptotic Behavior of Attention in Transformers
     Álvaro Rodríguez Abella, João Pedro Silvestre, Paulo Tabuada (03 Dec 2024)
 15. Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?
     Sohee Yang, Nora Kassner, E. Gribovskaya, Sebastian Riedel, Mor Geva [KELM, LRM, ReLM] (25 Nov 2024)
 16. DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
     Shreya Shankar, Tristan Chambers, Eugene Wu, Aditya G. Parameswaran [LLMAG] (16 Oct 2024)
 17. GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
     Iman Mirzadeh, Keivan Alizadeh, Hooman Shahrokhi, Oncel Tuzel, Samy Bengio, Mehrdad Farajtabar [AIMat, LRM] (07 Oct 2024)
 18. Evaluating and explaining training strategies for zero-shot cross-lingual news sentiment analysis
     Luka Andrenšek, Boshko Koloski, Andraz Pelicon, Nada Lavrac, Senja Pollak, Matthew Purver (30 Sep 2024)
 19. Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective
     Yotam Wolf, Binyamin Rothberg, Dorin Shteyman, Amnon Shashua (26 Sep 2024)
 20. One-layer transformers fail to solve the induction heads task
     Clayton Sanford, Daniel J. Hsu, Matus Telgarsky (26 Aug 2024)
 21. Can Large Language Models Reason? A Characterization via 3-SAT
     Rishi Hazra, Gabriele Venturato, Pedro Zuidberg Dos Martires, Luc de Raedt [ELM, ReLM, LRM] (13 Aug 2024)
 22. When Can Transformers Count to n?
     Gilad Yehudai, Haim Kaplan, Asma Ghandeharioun, Mor Geva, Amir Globerson (21 Jul 2024)
 23. On the Design and Analysis of LLM-Based Algorithms
     Yanxi Chen, Yaliang Li, Bolin Ding, Jingren Zhou (20 Jul 2024)
 24. Cognitive Map for Language Models: Optimal Planning via Verbally Representing the World Model
     Doyoung Kim, Jongwon Lee, Jinho Park, Minjoon Seo [LM&Ro] (21 Jun 2024)
 25. The Expressive Capacity of State Space Models: A Formal Language Perspective
     Yash Sarrof, Yana Veitsman, Michael Hahn [Mamba] (27 May 2024)
 26. Large Language Models for UAVs: Current State and Pathways to the Future
     Shumaila Javaid, Nasir Saeed, Bin He (02 May 2024)
 27. Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
     Badri N. Patro, Vijay Srinivas Agneeswaran [Mamba] (24 Apr 2024)
 28. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
     Jason W. Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, F. Xia, Ed H. Chi, Quoc Le, Denny Zhou [LM&Ro, LRM, AI4CE, ReLM] (28 Jan 2022)