Representational Strengths and Limitations of Transformers
5 June 2023
Clayton Sanford, Daniel J. Hsu, Matus Telgarsky
arXiv: 2306.02896
Papers citing "Representational Strengths and Limitations of Transformers" (16 papers shown)
1. Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers. Alireza Amiri, Xinting Huang, Mark Rofin, Michael Hahn. 04 Feb 2025. [LRM]
2. Strassen Attention: Unlocking Compositional Abilities in Transformers Based on a New Lower Bound Method. A. Kozachinskiy, Felipe Urrutia, Hector Jimenez, Tomasz Steifer, Germán Pizarro, Matías Fuentes, Francisco Meza, Cristian Buc, Cristóbal Rojas. 31 Jan 2025.
3. Approximation Rate of the Transformer Architecture for Sequence Modeling. Hao Jiang, Qianxiao Li. 03 Jan 2025.
4. Lower bounds on transformers with infinite precision. Alexander Kozachinskiy. 31 Dec 2024.
5. Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning. Md Rifat Arefin, G. Subbaraj, Nicolas Angelard-Gontier, Yann LeCun, Irina Rish, Ravid Shwartz-Ziv, C. Pal. 04 Nov 2024. [LRM]
6. Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective. Yotam Wolf, Binyamin Rothberg, Dorin Shteyman, Amnon Shashua. 26 Sep 2024.
7. When Can Transformers Count to n? Gilad Yehudai, Haim Kaplan, Asma Ghandeharioun, Mor Geva, Amir Globerson. 21 Jul 2024.
8. When big data actually are low-rank, or entrywise approximation of certain function-generated matrices. Stanislav Budzinskiy. 03 Jul 2024.
9. Separations in the Representational Capabilities of Transformers and Recurrent Architectures. S. Bhattamishra, Michael Hahn, Phil Blunsom, Varun Kanade. 13 Jun 2024. [GNN]
10. On the Theoretical Expressive Power and the Design Space of Higher-Order Graph Transformers. Cai Zhou, Rose Yu, Yusu Wang. 04 Apr 2024.
11. Implicit Bias and Fast Convergence Rates for Self-attention. Bhavya Vasudeva, Puneesh Deora, Christos Thrampoulidis. 08 Feb 2024.
12. Sample, estimate, aggregate: A recipe for causal discovery foundation models. Menghua Wu, Yujia Bao, Regina Barzilay, Tommi Jaakkola. 02 Feb 2024. [CML]
13. An Information-Theoretic Analysis of In-Context Learning. Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy. 28 Jan 2024.
14. How to Protect Copyright Data in Optimization of Large Language Models? T. Chu, Zhao-quan Song, Chiwun Yang. 23 Aug 2023.
15. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. C. Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas. 02 Dec 2016. [3DH, 3DPC, 3DV, PINN]
16. Benefits of depth in neural networks. Matus Telgarsky. 14 Feb 2016.