Splitting Steepest Descent for Growing Neural Architectures

Neural Information Processing Systems (NeurIPS), 2019
6 October 2019
Qiang Liu, Lemeng Wu, Dilin Wang

Papers citing "Splitting Steepest Descent for Growing Neural Architectures"

42 / 42 papers shown

Shared-Weights Extender and Gradient Voting for Neural Network Expansion
Nikolas Chatzis, I. Kordonis, Manos Theodosis, Petros Maragos
23 Sep 2025

Saddle Hierarchy in Dense Associative Memory
Robin Thériault, Daniele Tantari
26 Aug 2025

Flat Channels to Infinity in Neural Loss Landscapes
Flavio Martinelli, Alexander Van Meegen, Berfin Simsek, W. Gerstner, Johanni Brea
17 Jun 2025

Learning Morphisms with Gauss-Newton Approximation for Growing Networks
Neal Lawton, Aram Galstyan, Greg Ver Steeg
07 Nov 2024

Growing Efficient Accurate and Robust Neural Networks on the Edge
Vignesh Sundaresha, Naresh Shanbhag
10 Oct 2024

Growing Deep Neural Network Considering with Similarity between Neurons
Taigo Sakai, Kazuhiro Hotta
23 Aug 2024

DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen, Linhao Zhang, Junyuan Shang, Ying Tai, Tingwen Liu, Shuohuan Wang, Yu Sun
03 Jun 2024

Growing Tiny Networks: Spotting Expressivity Bottlenecks and Fixing Them Optimally
Manon Verbockhaven, Sylvain Chevallier, Guillaume Charpiat
30 May 2024

Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T, Arnav Chavan, Deepak Gupta
19 Feb 2024

Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged
08 Feb 2024

Preparing Lessons for Progressive Training on Language Models
AAAI Conference on Artificial Intelligence (AAAI), 2024
Yu Pan, Ye Yuan, Yichun Yin, Jiaxin Shi, Zenglin Xu, Ming Zhang, Lifeng Shang, Xin Jiang, Qun Liu
17 Jan 2024

When To Grow? A Fitting Risk-Aware Policy for Layer Growing in Deep Neural Networks
Haihang Wu, Wei Wang, T. Malepathirana, Damith A. Senanayake, D. Oetomo, Saman K. Halgamuge
06 Jan 2024

Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
International Conference on Machine Learning (ICML), 2023
Anthony Chen, Huanrui Yang, Yulu Gan, Denis A. Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Shanghang Zhang
14 Dec 2023

SensLI: Sensitivity-Based Layer Insertion for Neural Networks
Evelyn Herberg, Roland A. Herzog, Frederik Köhne, Leonie Kreis, Anton Schiela
27 Nov 2023

MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters
IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023
Chau Pham, Piotr Teterwak, Soren Nelson, Bryan A. Plummer
07 Nov 2023

Reusing Pretrained Models by Multi-linear Operators for Efficient Training
Yu Pan, Ye Yuan, Yichun Yin, Zenglin Xu, Lifeng Shang, Xin Jiang, Qun Liu
16 Oct 2023

Energy Concerns with HPC Systems and Applications
Roblex Nana, C. Tadonki, Petr Dokladal, Youssef Mesri
31 Aug 2023

Self-Expanding Neural Networks
Rupert Mitchell, Robin Menzenbach, Kristian Kersting, Martin Mundt
10 Jul 2023

Accelerated Training via Incrementally Growing Neural Networks using Variance Transfer and Learning Rate Adaptation
Neural Information Processing Systems (NeurIPS), 2023
Xin Yuan, Pedro H. P. Savarese, Michael Maire
22 Jun 2023

Learning to Grow Pretrained Models for Efficient Transformer Training
International Conference on Learning Representations (ICLR), 2023
Peihao Wang, Yikang Shen, Lucas Torroba Hennigen, P. Greengard, Leonid Karlinsky, Rogerio Feris, David D. Cox, Zinan Lin, Yoon Kim
02 Mar 2023

The Dormant Neuron Phenomenon in Deep Reinforcement Learning
International Conference on Machine Learning (ICML), 2023
Ghada Sokar, Rishabh Agarwal, Pablo Samuel Castro, Utku Evci
24 Feb 2023

Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs
Robert Joseph George, Jiawei Zhao, Jean Kossaifi, Zong-Yi Li, Anima Anandkumar
28 Nov 2022

Streamable Neural Fields
European Conference on Computer Vision (ECCV), 2022
Junwoo Cho, Seungtae Nam, Daniel Rho, J. Ko, Eunbyung Park
20 Jul 2022

Staged Training for Transformer Language Models
International Conference on Machine Learning (ICML), 2022
Sheng Shen, Pete Walsh, Kurt Keutzer, Jesse Dodge, Matthew E. Peters, Iz Beltagy
11 Mar 2022

Auto-scaling Vision Transformers without Training
International Conference on Learning Representations (ICLR), 2022
Wuyang Chen, Wei-Ping Huang, Xianzhi Du, Xiaodan Song, Zinan Lin, Denny Zhou
24 Feb 2022

Sparsity Winning Twice: Better Robust Generalization from More Efficient Training
International Conference on Learning Representations (ICLR), 2022
Tianlong Chen, Zhenyu Zhang, Pengju Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zinan Lin
20 Feb 2022

When, where, and how to add new neurons to ANNs
Kaitlin Maile, Emmanuel Rachelson, H. Luga, Dennis G. Wilson
17 Feb 2022

Growing Neural Network with Shared Parameter
Ruilin Tong
17 Jan 2022

GradMax: Growing Neural Networks using Gradient Information
International Conference on Learning Representations (ICLR), 2022
Utku Evci, B. V. Merrienboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa
13 Jan 2022

Growing Representation Learning
Ryan N. King, Bobak J. Mortazavi
17 Oct 2021

bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen, Yichun Yin, Lifeng Shang, Xin Jiang, Yujia Qin, Fengyu Wang, Zhi Wang, Xiao Chen, Zhiyuan Liu, Qun Liu
14 Oct 2021

PINNup: Robust neural network wavefield solutions using frequency upscaling and neuron splitting
Xinquan Huang, T. Alkhalifah
29 Sep 2021

On Anytime Learning at Macroscale
Lucas Caccia, Jing Xu, Myle Ott, Marc'Aurelio Ranzato, Ludovic Denoyer
17 Jun 2021

Differentiable Neural Architecture Search with Morphism-based Transformable Backbone Architectures
Renlong Jie, Junbin Gao
14 Jun 2021

The Elastic Lottery Ticket Hypothesis
Neural Information Processing Systems (NeurIPS), 2021
Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Jingjing Liu, Zinan Lin
30 Mar 2021

Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks
Neural Information Processing Systems (NeurIPS), 2021
Lemeng Wu, Bo Liu, Peter Stone, Qiang Liu
17 Feb 2021

Doping: A technique for efficient compression of LSTM models using sparse structured additive matrices
Conference on Machine Learning and Systems (MLSys), 2021
Urmish Thakker, P. Whatmough, Zhi-Gang Liu, Matthew Mattina, Jesse G. Beu
14 Feb 2021

BinaryBERT: Pushing the Limit of BERT Quantization
Annual Meeting of the Association for Computational Linguistics (ACL), 2020
Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King
31 Dec 2020

A Differential Game Theoretic Neural Optimizer for Training Residual Networks
Guan-Horng Liu, T. Chen, Evangelos A. Theodorou
17 Jul 2020

Steepest Descent Neural Architecture Optimization: Escaping Local Optimum with Signed Neural Splitting
Lemeng Wu, Mao Ye, Qi Lei, Jason D. Lee, Qiang Liu
23 Mar 2020

Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent
Dilin Wang, Meng Li, Lemeng Wu, Vikas Chandra, Qiang Liu
07 Oct 2019

Exploring Structural Sparsity of Deep Networks via Inverse Scale Spaces
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
Yanwei Fu, Chen Liu, Donghao Li, Zuyuan Zhong, Xinwei Sun, Jinshan Zeng, Xingtai Lv
23 May 2019