Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2107.13163
Cited By
Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers
28 July 2021
Colin Wei
Yining Chen
Tengyu Ma
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers"
15 / 15 papers shown
Title
Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights
Zhaiming Shen
Alex Havrilla
Rongjie Lai
A. Cloninger
Wenjing Liao
39
0
0
06 May 2025
Looped ReLU MLPs May Be All You Need as Practical Programmable Computers
Yingyu Liang
Zhizhou Sha
Zhenmei Shi
Zhao-quan Song
Yufa Zhou
91
18
0
21 Feb 2025
Lower Bounds for Chain-of-Thought Reasoning in Hard-Attention Transformers
Alireza Amiri
Xinting Huang
Mark Rofin
Michael Hahn
LRM
155
0
0
04 Feb 2025
Are Transformers Able to Reason by Connecting Separated Knowledge in Training Data?
Yutong Yin
Zhaoran Wang
LRM
ReLM
125
0
0
27 Jan 2025
When Can Transformers Count to n?
Gilad Yehudai
Haim Kaplan
Asma Ghandeharioun
Mor Geva
Amir Globerson
39
10
0
21 Jul 2024
Representing Rule-based Chatbots with Transformers
Dan Friedman
Abhishek Panigrahi
Danqi Chen
61
1
0
15 Jul 2024
U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models
Song Mei
3DV
AI4CE
DiffM
39
11
0
29 Apr 2024
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
Licong Lin
Yu Bai
Song Mei
OffRL
30
42
0
12 Oct 2023
Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization
Kaiyue Wen
Zhiyuan Li
Tengyu Ma
FAtt
28
26
0
20 Jul 2023
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
Hongkang Li
M. Wang
Sijia Liu
Pin-Yu Chen
ViT
MLT
35
56
0
12 Feb 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
45
11
0
30 Dec 2022
Same Pre-training Loss, Better Downstream: Implicit Bias Matters for Language Models
Hong Liu
Sang Michael Xie
Zhiyuan Li
Tengyu Ma
AI4CE
32
49
0
25 Oct 2022
Inductive Biases and Variable Creation in Self-Attention Mechanisms
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Cyril Zhang
27
115
0
19 Oct 2021
Benefits of depth in neural networks
Matus Telgarsky
125
602
0
14 Feb 2016
Norm-Based Capacity Control in Neural Networks
Behnam Neyshabur
Ryota Tomioka
Nathan Srebro
114
577
0
27 Feb 2015
1