Grokking as the Transition from Lazy to Rich Training Dynamics. International Conference on Learning Representations (ICLR), 2023.
Are Emergent Abilities of Large Language Models a Mirage? Neural Information Processing Systems (NeurIPS), 2023.
Progress measures for grokking via mechanistic interpretability. International Conference on Learning Representations (ICLR), 2023.
Grokking phase transitions in learning local rules with gradient descent. Journal of Machine Learning Research (JMLR), 2022.
Towards Understanding Grokking: An Effective Theory of Representation Learning. Neural Information Processing Systems (NeurIPS), 2022.
Microsoft COCO: Common Objects in Context. European Conference on Computer Vision (ECCV), 2014.