2506.21551
Cited By
v3 (latest)
Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test
26 June 2025
Ziyue Li
Chenrui Fan
Tianyi Zhou
Papers citing "Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test"
1 / 1 papers shown
Rewiring Experts on the Fly: Continuous Rerouting for Better Online Adaptation in Mixture-of-Expert models
Guinan Su
Yanwu Yang
Li Shen
Lu Yin
Shiwei Liu
Jonas Geiping
MoE
KELM
16 Oct 2025