2403.04978
Cited By
Stacking as Accelerated Gradient Descent
20 February 2025
Naman Agarwal, Pranjal Awasthi, Satyen Kale, Eric Zhao
ODL
Papers citing "Stacking as Accelerated Gradient Descent"
2 / 2 papers shown
Title
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild
Xinyu Zhao, Guoheng Sun, Ruisi Cai, Yukun Zhou, Pingzhi Li, ..., Binhang Yuan, Hongyi Wang, Ang Li, Zhangyang Wang, Tianlong Chen
MoMe
ALM
13
0
0
07 Oct 2024
On the Transformer Growth for Progressive BERT Training
Xiaotao Gu, Liyuan Liu, Hongkun Yu, Jing Li, C. L. P. Chen, Jiawei Han
VLM
61
49
0
23 Oct 2020