arXiv:1910.02366
Splitting Steepest Descent for Growing Neural Architectures
Neural Information Processing Systems (NeurIPS), 2019
6 October 2019
Qiang Liu, Lemeng Wu, Dilin Wang
Papers citing "Splitting Steepest Descent for Growing Neural Architectures" (42 papers)
Shared-Weights Extender and Gradient Voting for Neural Network Expansion
Nikolas Chatzis, I. Kordonis, Manos Theodosis, Petros Maragos. 23 Sep 2025.
Saddle Hierarchy in Dense Associative Memory
Robin Thériault, Daniele Tantari. 26 Aug 2025.
Flat Channels to Infinity in Neural Loss Landscapes
Flavio Martinelli, Alexander Van Meegen, Berfin Simsek, W. Gerstner, Johanni Brea. 17 Jun 2025.
Learning Morphisms with Gauss-Newton Approximation for Growing Networks
Neal Lawton, Aram Galstyan, Greg Ver Steeg. 07 Nov 2024.
Growing Efficient Accurate and Robust Neural Networks on the Edge
Vignesh Sundaresha, Naresh Shanbhag. 10 Oct 2024.
Growing Deep Neural Network Considering with Similarity between Neurons
Taigo Sakai, Kazuhiro Hotta. 23 Aug 2024.
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen, Linhao Zhang, Junyuan Shang, Ying Tai, Tingwen Liu, Shuohuan Wang, Yu Sun. 03 Jun 2024.
Growing Tiny Networks: Spotting Expressivity Bottlenecks and Fixing Them Optimally
Manon Verbockhaven, Sylvain Chevallier, Guillaume Charpiat. 30 May 2024.
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T, Arnav Chavan, Deepak Gupta. 19 Feb 2024.
Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, Francois Ged. 08 Feb 2024.
Preparing Lessons for Progressive Training on Language Models
Yu Pan, Ye Yuan, Yichun Yin, Jiaxin Shi, Zenglin Xu, Ming Zhang, Lifeng Shang, Xin Jiang, Qun Liu. AAAI Conference on Artificial Intelligence (AAAI), 2024. 17 Jan 2024.
When To Grow? A Fitting Risk-Aware Policy for Layer Growing in Deep Neural Networks
Haihang Wu, Wei Wang, T. Malepathirana, Damith A. Senanayake, D. Oetomo, Saman K. Halgamuge. 06 Jan 2024.
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
Anthony Chen, Huanrui Yang, Yulu Gan, Denis A. Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Shanghang Zhang. International Conference on Machine Learning (ICML), 2023. 14 Dec 2023.
SensLI: Sensitivity-Based Layer Insertion for Neural Networks
Evelyn Herberg, Roland A. Herzog, Frederik Köhne, Leonie Kreis, Anton Schiela. 27 Nov 2023.
MixtureGrowth: Growing Neural Networks by Recombining Learned Parameters
Chau Pham, Piotr Teterwak, Soren Nelson, Bryan A. Plummer. IEEE Workshop/Winter Conference on Applications of Computer Vision (WACV), 2023. 07 Nov 2023.
Reusing Pretrained Models by Multi-linear Operators for Efficient Training
Yu Pan, Ye Yuan, Yichun Yin, Zenglin Xu, Lifeng Shang, Xin Jiang, Qun Liu. 16 Oct 2023.
Energy Concerns with HPC Systems and Applications
Roblex Nana, C. Tadonki, Petr Dokladal, Youssef Mesri. 31 Aug 2023.
Self-Expanding Neural Networks
Rupert Mitchell, Robin Menzenbach, Kristian Kersting, Martin Mundt. 10 Jul 2023.
Accelerated Training via Incrementally Growing Neural Networks using Variance Transfer and Learning Rate Adaptation
Xin Yuan, Pedro H. P. Savarese, Michael Maire. Neural Information Processing Systems (NeurIPS), 2023. 22 Jun 2023.
Learning to Grow Pretrained Models for Efficient Transformer Training
Peihao Wang, Yikang Shen, Lucas Torroba Hennigen, P. Greengard, Leonid Karlinsky, Rogerio Feris, David D. Cox, Zinan Lin, Yoon Kim. International Conference on Learning Representations (ICLR), 2023. 02 Mar 2023.
The Dormant Neuron Phenomenon in Deep Reinforcement Learning
Ghada Sokar, Rishabh Agarwal, Pablo Samuel Castro, Utku Evci. International Conference on Machine Learning (ICML), 2023. 24 Feb 2023.
Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs
Robert Joseph George, Jiawei Zhao, Jean Kossaifi, Zong-Yi Li, Anima Anandkumar. 28 Nov 2022.
Streamable Neural Fields
Junwoo Cho, Seungtae Nam, Daniel Rho, J. Ko, Eunbyung Park. European Conference on Computer Vision (ECCV), 2022. 20 Jul 2022.
Staged Training for Transformer Language Models
Sheng Shen, Pete Walsh, Kurt Keutzer, Jesse Dodge, Matthew E. Peters, Iz Beltagy. International Conference on Machine Learning (ICML), 2022. 11 Mar 2022.
Auto-scaling Vision Transformers without Training
Wuyang Chen, Wei-Ping Huang, Xianzhi Du, Xiaodan Song, Zinan Lin, Denny Zhou. International Conference on Learning Representations (ICLR), 2022. 24 Feb 2022.
Sparsity Winning Twice: Better Robust Generalization from More Efficient Training
Tianlong Chen, Zhenyu Zhang, Pengju Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zinan Lin. International Conference on Learning Representations (ICLR), 2022. 20 Feb 2022.
When, where, and how to add new neurons to ANNs
Kaitlin Maile, Emmanuel Rachelson, H. Luga, Dennis G. Wilson. 17 Feb 2022.
Growing Neural Network with Shared Parameter
Ruilin Tong. 17 Jan 2022.
GradMax: Growing Neural Networks using Gradient Information
Utku Evci, B. V. Merrienboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa. International Conference on Learning Representations (ICLR), 2022. 13 Jan 2022.
Growing Representation Learning
Ryan N. King, Bobak J. Mortazavi. 17 Oct 2021.
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen, Yichun Yin, Lifeng Shang, Xin Jiang, Yujia Qin, Fengyu Wang, Zhi Wang, Xiao Chen, Zhiyuan Liu, Qun Liu. 14 Oct 2021.
PINNup: Robust neural network wavefield solutions using frequency upscaling and neuron splitting
Xinquan Huang, T. Alkhalifah. 29 Sep 2021.
On Anytime Learning at Macroscale
Lucas Caccia, Jing Xu, Myle Ott, Marc'Aurelio Ranzato, Ludovic Denoyer. 17 Jun 2021.
Differentiable Neural Architecture Search with Morphism-based Transformable Backbone Architectures
Renlong Jie, Junbin Gao. 14 Jun 2021.
The Elastic Lottery Ticket Hypothesis
Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Jingjing Liu, Zinan Lin. Neural Information Processing Systems (NeurIPS), 2021. 30 Mar 2021.
Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks
Lemeng Wu, Bo Liu, Peter Stone, Qiang Liu. Neural Information Processing Systems (NeurIPS), 2021. 17 Feb 2021.
Doping: A technique for efficient compression of LSTM models using sparse structured additive matrices
Urmish Thakker, P. Whatmough, Zhi-Gang Liu, Matthew Mattina, Jesse G. Beu. Conference on Machine Learning and Systems (MLSys), 2021. 14 Feb 2021.
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai, Wei Zhang, Lu Hou, Lifeng Shang, Jing Jin, Xin Jiang, Qun Liu, Michael Lyu, Irwin King. Annual Meeting of the Association for Computational Linguistics (ACL), 2020. 31 Dec 2020.
A Differential Game Theoretic Neural Optimizer for Training Residual Networks
Guan-Horng Liu, T. Chen, Evangelos A. Theodorou. 17 Jul 2020.
Steepest Descent Neural Architecture Optimization: Escaping Local Optimum with Signed Neural Splitting
Lemeng Wu, Mao Ye, Qi Lei, Jason D. Lee, Qiang Liu. 23 Mar 2020.
Energy-Aware Neural Architecture Optimization with Fast Splitting Steepest Descent
Dilin Wang, Meng Li, Lemeng Wu, Vikas Chandra, Qiang Liu. 07 Oct 2019.
Exploring Structural Sparsity of Deep Networks via Inverse Scale Spaces
Yanwei Fu, Chen Liu, Donghao Li, Zuyuan Zhong, Xinwei Sun, Jinshan Zeng, Xingtai Lv. IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019. 23 May 2019.