2406.00048
Towards a theory of how the structure of language is acquired by deep neural networks
28 May 2024
Francesco Cagnetta
M. Wyart
Papers citing "Towards a theory of how the structure of language is acquired by deep neural networks"
5 / 5 papers shown
Title
A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende
Federica Gerace
A. Laio
Sebastian Goldt
68
7
0
17 Feb 2025
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi
Emanuele Troiani
Luca Arnaboldi
Luca Pesce
Lenka Zdeborová
Florent Krzakala
MLT
53
24
0
05 Feb 2024
A Dynamical Model of Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
C. Pehlevan
41
36
0
02 Feb 2024
Do Transformers Parse while Predicting the Masked Word?
Haoyu Zhao
A. Panigrahi
Rong Ge
Sanjeev Arora
74
29
0
14 Mar 2023
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020