Information-Theoretic Progress Measures reveal Grokking is an Emergent Phase Transition

16 August 2024
Kenzo Clauw
S. Stramaglia
Daniele Marinazzo

Papers citing "Information-Theoretic Progress Measures reveal Grokking is an Emergent Phase Transition"

6 / 6 papers shown
Beyond the Next Token: Towards Prompt-Robust Zero-Shot Classification via Efficient Multi-Token Prediction
Junlang Qian
Zixiao Zhu
Hanzhang Zhou
Zijian Feng
Zepeng Zhai
K. Mao
AAML
VLM
38
0
0
04 Apr 2025
Do Mice Grok? Glimpses of Hidden Progress During Overtraining in Sensory Cortex
Tanishq Kumar
Blake Bordelon
C. Pehlevan
Venkatesh N. Murthy
Samuel Gershman
OOD
CLL
SSL
43
0
0
05 Nov 2024
Deconstructing the Goldilocks Zone of Neural Network Initialization
Artem Vysogorets
Anna Dawid
Julia Kempe
30
1
0
05 Feb 2024
Higher-order mutual information reveals synergistic sub-networks for multi-neuron importance
Kenzo Clauw
S. Stramaglia
Daniele Marinazzo
SSL
FAtt
22
6
0
01 Nov 2022
Quantifying Local Specialization in Deep Neural Networks
Shlomi Hod
Daniel Filan
Stephen Casper
Andrew Critch
Stuart J. Russell
58
10
0
13 Oct 2021
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
273
2,878
0
15 Sep 2016