Towards a theory of how the structure of language is acquired by deep neural networks

28 May 2024
Francesco Cagnetta, M. Wyart

Papers citing "Towards a theory of how the structure of language is acquired by deep neural networks"

A distributional simplicity bias in the learning dynamics of transformers
Riccardo Rende, Federica Gerace, A. Laio, Sebastian Goldt
17 Feb 2025

The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala
05 Feb 2024

A Dynamical Model of Neural Scaling Laws
Blake Bordelon, Alexander B. Atanasov, C. Pehlevan
02 Feb 2024

Do Transformers Parse while Predicting the Masked Word?
Haoyu Zhao, A. Panigrahi, Rong Ge, Sanjeev Arora
14 Mar 2023

Scaling Laws for Neural Language Models
Jared Kaplan, Sam McCandlish, T. Henighan, Tom B. Brown, B. Chess, R. Child, Scott Gray, Alec Radford, Jeff Wu, Dario Amodei
23 Jan 2020