Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise

7 June 2024

Vignesh Kothapalli

Papers citing "Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise"

Title | Authors | Tags | Counts | Date
Model Balancing Helps Low-data Training and Fine-tuning | Zihang Liu, Y. Hu, Tianyu Pang, Yefan Zhou, Pu Ren, Yaoqing Yang | - | 29 2 0 | 16 Oct 2024
AlphaPruning: Using Heavy-Tailed Self Regularization Theory for Improved Layer-wise Pruning of Large Language Models | Haiquan Lu, Yefan Zhou, Shiwei Liu, Zhangyang Wang, Michael W. Mahoney, Yaoqing Yang | - | 13 0 0 | 14 Oct 2024
AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality | Peijun Qing, Chongyang Gao, Yefan Zhou, Xingjian Diao, Yaoqing Yang, Soroush Vosoughi | MoMe, MoE | 19 3 0 | 14 Oct 2024
Asymptotics of feature learning in two-layer networks after one gradient-step | Hugo Cui, Luca Pesce, Yatin Dandi, Florent Krzakala, Yue M. Lu, Lenka Zdeborová, Bruno Loureiro | MLT | 44 16 0 | 07 Feb 2024
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents | Yatin Dandi, Emanuele Troiani, Luca Arnaboldi, Luca Pesce, Lenka Zdeborová, Florent Krzakala | MLT | 59 24 0 | 05 Feb 2024
Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training | Yefan Zhou, Tianyu Pang, Keqin Liu, Charles H. Martin, Michael W. Mahoney, Yaoqing Yang | - | 34 7 0 | 01 Dec 2023
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, but Sign Descent Might Be | Frederik Kunstner, Jacques Chen, J. Lavington, Mark W. Schmidt | - | 38 66 0 | 27 Apr 2023
Learning Single-Index Models with Shallow Neural Networks | A. Bietti, Joan Bruna, Clayton Sanford, M. Song | - | 160 65 0 | 27 Oct 2022
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD | Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu | MLT | 319 48 0 | 29 Sep 2022
The large learning rate phase of deep learning: the catapult mechanism | Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari | ODL | 150 232 0 | 04 Mar 2020