Scaling Laws and Interpretability of Learning from Repeated Data

21 May 2022
Danny Hernandez
Tom B. Brown
Tom Conerly
Nova Dassarma
Dawn Drain
S. E. Showk
Nelson Elhage
Zac Hatfield-Dodds
T. Henighan
Tristan Hume
Scott R. Johnston
Benjamin Mann
C. Olah
Catherine Olsson
Dario Amodei
Nicholas Joseph
Jared Kaplan
Sam McCandlish
arXiv: 2205.10487

Papers citing "Scaling Laws and Interpretability of Learning from Repeated Data"

50 / 96 papers shown
Similarity Field Theory: A Mathematical Framework for Intelligence
Kei-Sing Ng
174
0
0
24 Dec 2025
Diffusion Language Models are Super Data Learners
Jinjie Ni
Qian Liu
Longxu Dou
Chao Du
Zili Wang
Hang Yan
Tianyu Pang
Michael Shieh
AI4CE
138
15
0
05 Nov 2025
Efficient Prediction of Pass@k Scaling in Large Language Models
Joshua Kazdan
Rylan Schaeffer
Youssef Allouah
Colin Sullivan
Kyssen Yu
Noam Levi
Sanmi Koyejo
OffRL
136
1
0
06 Oct 2025
Scaling Behaviors of LLM Reinforcement Learning Post-Training: An Empirical Study in Mathematical Reasoning
Zelin Tan
Hejia Geng
M. Zhang
Xiaohang Yu
Guancheng Wan
...
G. Zhang
Chen Zhang
Z. Yin
Wenlong Zhang
Lei Bai
OffRL, LRM
455
3
1
29 Sep 2025
Predicting Training Re-evaluation Curves Enables Effective Data Curriculums for LLMs
Shane Bergsma
Nolan Dey
Joel Hestness
166
0
0
29 Sep 2025
Pretraining Scaling Laws for Generative Evaluations of Language Models
Rylan Schaeffer
Noam Levi
Brando Miranda
Sanmi Koyejo
124
1
0
28 Sep 2025
Evaluating the Robustness of Chinchilla Compute-Optimal Scaling
Rylan Schaeffer
Noam Levi
Andreas Kirsch
Theo Guenais
Brando Miranda
Elyas Obbad
Sanmi Koyejo
LRM
187
0
0
28 Sep 2025
We Think, Therefore We Align LLMs to Helpful, Harmless and Honest Before They Go Wrong
Gautam Siddharth Kashyap
Mark Dras
Usman Naseem
LLMSV
194
2
0
26 Sep 2025
Large Language Models for Real-World IoT Device Identification
Rameen Mahmood
Tousif Ahmed
Sai Teja Peddinti
Danny Yuxing Huang
81
0
0
24 Sep 2025
How to inject knowledge efficiently? Knowledge Infusion Scaling Law for Pre-training Large Language Models
Kangtao Lv
Haibin Chen
Yujin Yuan
Langming Liu
Shilei Liu
Yongwei Wang
Yuchi Xu
B. Zheng
KELM
153
0
0
19 Sep 2025
EvoLM: In Search of Lost Language Model Training Dynamics
Zhenting Qi
Fan Nie
Alexandre Alahi
James Zou
Himabindu Lakkaraju
Yilun Du
Eric P. Xing
Sham Kakade
Hanlin Zhang
312
3
0
19 Jun 2025
Complexity Scaling Laws for Neural Models using Combinatorial Optimization
Lowell Weissman
Michael Krumdick
A. Lynn Abbott
294
0
0
15 Jun 2025
Scaling Laws of Motion Forecasting and Planning - Technical Report
Mustafa Baniodeh
Kratarth Goel
Scott Ettinger
Carlos Fuertes
Ari Seff
...
Vinutha Kallem
Sergio Casas
Rami Al-Rfou
Benjamin Sapp
Dragomir Anguelov
288
10
0
09 Jun 2025
The emergence of sparse attention: impact of data distribution and benefits of repetition
Nicolas Zucchet
Francesco D'Angelo
Andrew Kyle Lampinen
Stephanie C. Y. Chan
439
5
0
23 May 2025
Learning Auxiliary Tasks Improves Reference-Free Hallucination Detection in Open-Domain Long-Form Generation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Chengwei Qin
Wenxuan Zhou
Karthik Abinav Sankararaman
Nanshu Wang
Tengyu Xu
...
Aditya Tayade
Sinong Wang
Shafiq Joty
Han Fang
Hao Ma
HILM, LRM
271
0
0
18 May 2025
Induction Head Toxicity Mechanistically Explains Repetition Curse in Large Language Models
Shuxun Wang
Qingyu Yin
Chak Tou Leong
Qiang Zhang
Linyi Yang
276
3
0
17 May 2025
Superposition Yields Robust Neural Scaling
Yizhou Liu
Ziming Liu
Jeff Gore
MILM
639
4
0
15 May 2025
Parallel Scaling Law for Language Models
Mouxiang Chen
Binyuan Hui
Zeyu Cui
Jiaxi Yang
Dayiheng Liu
Jianling Sun
Junyang Lin
Zhongxin Liu
MoE, LRM
337
20
0
15 May 2025
xGen-small Technical Report
Erik Nijkamp
Bo Pang
Egor Pakhomov
Akash Gokul
Jin Qu
Silvio Savarese
Yingbo Zhou
Caiming Xiong
LLMAG
385
1
0
10 May 2025
Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance
Takuya Tamura
Taro Yano
Masafumi Enomoto
Masafumi Oyamada
179
0
0
28 Apr 2025
QuaDMix: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining
Fengze Liu
Weidong Zhou
Binbin Liu
Zhimiao Yu
Yifan Zhang
...
Yifeng Yu
Bingni Zhang
Xiaohuan Zhou
Taifeng Wang
Yong Cao
402
5
0
23 Apr 2025
ToReMi: Topic-Aware Data Reweighting for Dynamic Pre-Training Data Selection
Xiaoxuan Zhu
Zhouhong Gu
Baiqian Wu
Suhang Zheng
Tao Wang
Tianyu Li
Hongwei Feng
Yanghua Xiao
522
1
0
01 Apr 2025
(Mis)Fitting: A Survey of Scaling Laws
Margaret Li
Sneha Kudugunta
Luke Zettlemoyer
413
12
0
26 Feb 2025
Scaling Laws for Downstream Task Performance in Machine Translation
International Conference on Learning Representations (ICLR), 2024
Berivan Isik
Natalia Ponomareva
Hussein Hazimeh
Dimitris Paparas
Sergei Vassilvitskii
Sanmi Koyejo
315
23
0
24 Feb 2025
MATH-Perturb: Benchmarking LLMs' Math Reasoning Abilities against Hard Perturbations
Kaixuan Huang
Jiacheng Guo
Zihao Li
X. Ji
Jiawei Ge
...
Yangsibo Huang
Chi Jin
Xinyun Chen
Chiyuan Zhang
Mengdi Wang
AAML, LRM
655
53
0
10 Feb 2025
CoddLLM: Empowering Large Language Models for Data Analytics
Jiani Zhang
Hengrui Zhang
Rishav Chakravarti
Yiqun Hu
Patrick Ng
Asterios Katsifodimos
Huzefa Rangwala
George Karypis
Alon Halevy
SyDa, ELM
899
5
0
01 Feb 2025
Training Compute-Optimal Protein Language Models
bioRxiv, 2024
Xingyi Cheng
Bo Chen
Pan Li
Jing Gong
Jie Tang
Le Song
312
29
0
04 Nov 2024
Optimizing Low-Resource Language Model Training: Comprehensive Analysis of Multi-Epoch, Multi-Lingual, and Two-Stage Approaches
Kosuke Akimoto
Masafumi Oyamada
204
0
0
16 Oct 2024
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
International Conference on Learning Representations (ICLR), 2024
Yiding Jiang
Allan Zhou
Zhili Feng
Sadhika Malladi
J. Zico Kolter
233
33
0
15 Oct 2024
Scaling Laws for Predicting Downstream Performance in LLMs
Yangyi Chen
Binxuan Huang
Yifan Gao
Zhengyang Wang
Jingfeng Yang
Heng Ji
LRM
365
26
0
11 Oct 2024
Emergent properties with repeated examples
Francois Charton
Julia Kempe
AIMat
244
7
0
09 Oct 2024
Mechanistic?
BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackBoxNLP), 2024
Naomi Saphra
Sarah Wiegreffe
AI4CE
260
32
0
07 Oct 2024
Training Language Models on the Knowledge Graph: Insights on Hallucinations and Their Detectability
Jiri Hron
Laura J. Culp
Gamaleldin F. Elsayed
Rosanne Liu
Ben Adlam
...
T. Warkentin
Lechao Xiao
Kelvin Xu
Jasper Snoek
Simon Kornblith
166
3
0
14 Aug 2024
Principled Understanding of Generalization for Generative Transformer Models in Arithmetic Reasoning Tasks
Xingcheng Xu
Zibo Zhao
Haipeng Zhang
Yanqing Yang
LRM
261
0
0
25 Jul 2024
A Survey on Symbolic Knowledge Distillation of Large Language Models
Kamal Acharya
Alvaro Velasquez
Haoze Song
SyDa
288
23
0
12 Jul 2024
SoftDedup: an Efficient Data Reweighting Method for Speeding Up Language Model Pre-training
Nan He
Weichen Xiong
Hanwen Liu
Yi Liao
Lei Ding
Kai Zhang
Guohua Tang
Xiao Han
Wei Yang
185
5
0
09 Jul 2024
Breaking Language Barriers: Cross-Lingual Continual Pre-Training at Scale
Wenzhen Zheng
Wenbo Pan
Xu Xu
Libo Qin
Li Yue
Ming Zhou
CLL
227
13
0
02 Jul 2024
Collaborative Performance Prediction for Large Language Models
Qiyuan Zhang
Fuyuan Lyu
Xue Liu
Chen Ma
199
6
0
01 Jul 2024
Efficient Continual Pre-training by Mitigating the Stability Gap
Yiduo Guo
Jie Fu
Huishuai Zhang
Dongyan Zhao
Songlin Yang
306
21
0
21 Jun 2024
Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations
Rylan Schaeffer
Victor Lecomte
Dhruv Pai
Andres Carranza
Berivan Isik
...
Yann LeCun
SueYeon Chung
Andrey Gromov
Ravid Shwartz-Ziv
Sanmi Koyejo
276
9
0
13 Jun 2024
Unique Security and Privacy Threats of Large Language Models: A Comprehensive Survey
Shang Wang
Tianqing Zhu
B. Liu
Ming Ding
Dayong Ye
Wanlei Zhou
PILM
385
22
0
12 Jun 2024
Large Language Model-guided Document Selection
Xiang Kong
Tom Gunter
Ruoming Pang
192
7
0
07 Jun 2024
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series
Ge Zhang
Scott Qu
Jiaheng Liu
Chenchen Zhang
Chenghua Lin
...
Zi-Kai Zhao
Jiajun Zhang
Wanli Ouyang
Wenhao Huang
Lei Ma
ELM
313
72
0
29 May 2024
A Survey of Multimodal Large Language Model from A Data-centric Perspective
Tianyi Bai
Hao Liang
Binwang Wan
Yanran Xu
Xi Li
...
Ping Huang
Jiulong Shan
Conghui He
Binhang Yuan
Wentao Zhang
384
64
0
26 May 2024
360Zhinao Technical Report
360Zhinao Team
218
0
0
22 May 2024
Token-wise Influential Training Data Retrieval for Large Language Models
Huawei Lin
Jikai Long
Zhaozhuo Xu
Weijie Zhao
242
11
0
20 May 2024
LMD3: Language Model Data Density Dependence
John Kirchenbauer
Garrett Honke
Gowthami Somepalli
Jonas Geiping
Daphne Ippolito
Katherine Lee
Tom Goldstein
David Andre
232
8
0
10 May 2024
Mechanistic Interpretability for AI Safety -- A Review
Leonard Bereska
E. Gavves
AI4CE
335
298
0
22 Apr 2024
When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering
Stephen Choi
William Gazeley
KELM
181
4
0
19 Apr 2024
RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs
Shreyas Chaudhari
Pranjal Aggarwal
Vishvak Murahari
Tanmay Rajpurohit
Ashwin Kalyan
Karthik Narasimhan
Ameet Deshpande
Bruno Castro da Silva
407
90
0
12 Apr 2024