Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability
Tyler A. Chang, Z. Tu, Benjamin Bergen
arXiv:2308.15419, 29 August 2023

Cited By
Papers citing "Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability" (8 papers):
1. Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models. Tyler A. Chang, Benjamin Bergen. 21 Apr 2025.
2. A Distributional Perspective on Word Learning in Neural Language Models. Filippo Ficarra, Ryan Cotterell, Alex Warstadt. 09 Feb 2025.
3. Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations. Ziqiao Ma, Zekun Wang, Joyce Chai. 22 May 2024.
4. In-context Learning and Induction Heads. Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah. 24 Sep 2022.
5. Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models. Terra Blevins, Hila Gonen, Luke Zettlemoyer. 24 May 2022.
6. Word Acquisition in Neural Language Models. Tyler A. Chang, Benjamin Bergen. 05 Oct 2021.
7. Frequency Effects on Syntactic Rule Learning in Transformers. Jason W. Wei, Dan Garrette, Tal Linzen, Ellie Pavlick. 14 Sep 2021.
8. How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models. Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych. 31 Dec 2020.