arXiv:1910.02366
Splitting Steepest Descent for Growing Neural Architectures
Neural Information Processing Systems (NeurIPS), 2019
6 October 2019
Qiang Liu, Lemeng Wu, Dilin Wang
Papers citing "Splitting Steepest Descent for Growing Neural Architectures" (42 papers)
Shared-Weights Extender and Gradient Voting for Neural Network Expansion
Nikolas Chatzis, I. Kordonis, Manos Theodosis, Petros Maragos. 23 Sep 2025.
Saddle Hierarchy in Dense Associative Memory
Robin Thériault, Daniele Tantari. 26 Aug 2025.
Flat Channels to Infinity in Neural Loss Landscapes
Flavio Martinelli, Alexander Van Meegen, Berfin Simsek, W. Gerstner, Johanni Brea. 17 Jun 2025.
Learning Morphisms with Gauss-Newton Approximation for Growing Networks
Neal Lawton, Aram Galstyan, Greg Ver Steeg. 07 Nov 2024.
Growing Efficient Accurate and Robust Neural Networks on the Edge
Vignesh Sundaresha, Naresh Shanbhag. 10 Oct 2024.
Growing Deep Neural Network Considering with Similarity between Neurons
Taigo Sakai, Kazuhiro Hotta. 23 Aug 2024.
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen, Linhao Zhang, Junyuan Shang, Ying Tai, Tingwen Liu, Shuohuan Wang, Yu Sun. 03 Jun 2024.
Growing Tiny Networks: Spotting Expressivity Bottlenecks and Fixing Them Optimally
Manon Verbockhaven, Sylvain Chevallier, Guillaume Charpiat. 30 May 2024.
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T, Arnav Chavan, Deepak Gupta. 19 Feb 2024.
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged. 08 Feb 2024.
Preparing Lessons for Progressive Training on Language Models
Yu Pan, Ye Yuan, Yichun Yin, Jiaxin Shi, Zenglin Xu, Ming Zhang, Lifeng Shang, Xin Jiang, Qun Liu. AAAI Conference on Artificial Intelligence (AAAI), 2024. 17 Jan 2024.
When To Grow? A Fitting Risk-Aware Policy for Layer Growing in Deep Neural Networks
Haihang Wu, Wei Wang, T. Malepathirana, Damith A. Senanayake, D. Oetomo, Saman K. Halgamuge. 06 Jan 2024.
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
Anthony Chen, Huanrui Yang, Yulu Gan, Denis A. Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Shanghang Zhang. International Conference on Machine Learning (ICML), 2023. 14 Dec 2023.
SensLI: Sensitivity-Based Layer Insertion for Neural Networks
Evelyn Herberg, Roland A. Herzog, Frederik Köhne, Leonie Kreis, Anton Schiela. 27 Nov 2023.
MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters
Chau Pham, Piotr Teterwak, Soren Nelson, Bryan A. Plummer. IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023. 07 Nov 2023.
Reusing Pretrained Models by Multi-linear Operators for Efficient Training
Yu Pan, Ye Yuan, Yichun Yin, Zenglin Xu, Lifeng Shang, Xin Jiang, Qun Liu. 16 Oct 2023.
Energy Concerns with HPC Systems and Applications
Roblex Nana, C. Tadonki, Petr Dokladal, Youssef Mesri. 31 Aug 2023.
Self-Expanding Neural Networks
Rupert Mitchell, Robin Menzenbach, Kristian Kersting, Martin Mundt. 10 Jul 2023.
Accelerated Training via Incrementally Growing Neural Networks using Variance Transfer and Learning Rate Adaptation
Xin Yuan, Pedro H. P. Savarese, Michael Maire. Neural Information Processing Systems (NeurIPS), 2023. 22 Jun 2023.
Learning to Grow Pretrained Models for Efficient Transformer Training
Peihao Wang, Yikang Shen, Lucas Torroba Hennigen, P. Greengard, Leonid Karlinsky, Rogerio Feris, David D. Cox, Zinan Lin, Yoon Kim. International Conference on Learning Representations (ICLR), 2023. 02 Mar 2023.
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
Ghada Sokar, Rishabh Agarwal, Pablo Samuel Castro, Utku Evci. International Conference on Machine Learning (ICML), 2023. 24 Feb 2023.
Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs
Robert Joseph George, Jiawei Zhao, Jean Kossaifi, Zong-Yi Li, Anima Anandkumar. 28 Nov 2022.
Streamable Neural Fields
Junwoo Cho, Seungtae Nam, Daniel Rho, J. Ko, Eunbyung Park. European Conference on Computer Vision (ECCV), 2022. 20 Jul 2022.
Staged Training for Transformer Language Models
Sheng Shen, Pete Walsh, Kurt Keutzer, Jesse Dodge, Matthew E. Peters, Iz Beltagy. International Conference on Machine Learning (ICML), 2022. 11 Mar 2022.
Auto-scaling Vision Transformers without Training
Wuyang Chen, Wei-Ping Huang, Xianzhi Du, Xiaodan Song, Zinan Lin, Denny Zhou. International Conference on Learning Representations (ICLR), 2022. 24 Feb 2022.
Sparsity Winning Twice: Better Robust Generalization from More Efficient Training
Tianlong Chen, Zhenyu Zhang, Pengju Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zinan Lin. International Conference on Learning Representations (ICLR), 2022. 20 Feb 2022.
When, where, and how to add new neurons to ANNs
Kaitlin Maile, Emmanuel Rachelson, H. Luga, Dennis G. Wilson. 17 Feb 2022.
Growing Neural Network with Shared Parameter
Ruilin Tong. 17 Jan 2022.
GradMax: Growing Neural Networks using Gradient Information
Utku Evci, B. V. Merrienboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa. International Conference on Learning Representations (ICLR), 2022. 13 Jan 2022.
Growing Representation Learning
Ryan N. King, Bobak J. Mortazavi. 17 Oct 2021.
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen, Yichun Yin, Lifeng Shang, Xin Jiang, Yujia Qin, Fengyu Wang, Zhi Wang, Xiao Chen, Zhiyuan Liu, Qun Liu. 14 Oct 2021.
PINNup: Robust neural network wavefield solutions using frequency upscaling and neuron splitting
Xinquan Huang, T. Alkhalifah. 29 Sep 2021.
On Anytime Learning at Macroscale
Lucas Caccia, Jing Xu, Myle Ott, Marc'Aurelio Ranzato, Ludovic Denoyer. 17 Jun 2021.
Differentiable Neural Architecture Search with Morphism-based Transformable Backbone Architectures
Renlong Jie, Junbin Gao. 14 Jun 2021.
The Elastic Lottery Ticket Hypothesis
Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Jingjing Liu, Zinan Lin. Neural Information Processing Systems (NeurIPS), 2021. 30 Mar 2021.
Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks
Lemeng Wu, Bo Liu, Peter Stone, Qiang Liu. Neural Information Processing Systems (NeurIPS), 2021. 17 Feb 2021.
Doping: A technique for efficient compression of LSTM models using sparse structured additive matrices
Urmish Thakker, P. Whatmough, Zhi-Gang Liu, Matthew Mattina, Jesse G. Beu. Conference on Machine Learning and Systems (MLSys), 2021. 14 Feb 2021.
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King. Annual Meeting of the Association for Computational Linguistics (ACL), 2020. 31 Dec 2020.
A Differential Game Theoretic Neural Optimizer for Training Residual Networks
Guan-Horng Liu, T. Chen, Evangelos A. Theodorou. 17 Jul 2020.
Steepest Descent Neural Architecture Optimization: Escaping Local Optimum with Signed Neural Splitting
Lemeng Wu, Mao Ye, Qi Lei, Jason D. Lee, Qiang Liu. 23 Mar 2020.
Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent
Dilin Wang, Meng Li, Lemeng Wu, Vikas Chandra, Qiang Liu. 07 Oct 2019.
Exploring Structural Sparsity of Deep Networks via Inverse Scale Spaces
Yanwei Fu, Chen Liu, Donghao Li, Zuyuan Zhong, Xinwei Sun, Jinshan Zeng, Xingtai Lv. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019. 23 May 2019.