Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability

29 August 2023
Tyler A. Chang, Z. Tu, Benjamin Bergen
arXiv:2308.15419

Papers citing "Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability"

8 citing papers:

1. Bigram Subnetworks: Mapping to Next Tokens in Transformer Language Models
   Tyler A. Chang, Benjamin Bergen
   21 Apr 2025

2. A Distributional Perspective on Word Learning in Neural Language Models
   Filippo Ficarra, Ryan Cotterell, Alex Warstadt
   09 Feb 2025

3. Babysit A Language Model From Scratch: Interactive Language Learning by Trials and Demonstrations
   Ziqiao Ma, Zekun Wang, Joyce Chai
   22 May 2024

4. In-context Learning and Induction Heads
   Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah
   24 Sep 2022

5. Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models
   Terra Blevins, Hila Gonen, Luke Zettlemoyer
   24 May 2022

6. Word Acquisition in Neural Language Models
   Tyler A. Chang, Benjamin Bergen
   05 Oct 2021

7. Frequency Effects on Syntactic Rule Learning in Transformers
   Jason W. Wei, Dan Garrette, Tal Linzen, Ellie Pavlick
   14 Sep 2021

8. How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
   Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych
   31 Dec 2020