Efficient Stagewise Pretraining via Progressive Subnetworks


8 February 2024
Abhishek Panigrahi
Nikunj Saunshi
Kaifeng Lyu
Sobhan Miryoosefi
Sashank J. Reddi
Satyen Kale
Sanjiv Kumar

Papers citing "Efficient Stagewise Pretraining via Progressive Subnetworks"

6 / 6 papers shown
TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training
Michael Menezes
Barbara Su
Xinze Feng
Yehya Farhat
Hamza Shili
Anastasios Kyrillidis
223
1
0
06 Nov 2025
Curriculum-Guided Layer Scaling for Language Model Pretraining
Karanpartap Singh
Neil Band
Ehsan Adeli
ALM, LRM
285
1
0
13 Jun 2025
Efficient Knowledge Distillation via Curriculum Extraction
Shivam Gupta
Sushrut Karmalkar
372
3
0
21 Mar 2025
Upcycling Large Language Models into Mixture of Experts
Ethan He
Syeda Nahida Akter
R. Prenger
V. Korthikanti
Zijie Yan
Tong Liu
Shiqing Fan
Ashwath Aithal
Mohammad Shoeybi
Bryan Catanzaro
MoE
551
36
0
10 Oct 2024
A Quadratic Synchronization Rule for Distributed Deep Learning
International Conference on Learning Representations (ICLR), 2023
Xinran Gu
Kaifeng Lyu
Sanjeev Arora
Jingzhao Zhang
Longbo Huang
347
4
0
22 Oct 2023
TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages
Transactions of the Association for Computational Linguistics (TACL), 2020
J. Clark
Eunsol Choi
Michael Collins
Dan Garrette
Tom Kwiatkowski
Vitaly Nikolaev
J. Palomaki
699
711
0
10 Mar 2020