Steepest Descent Neural Architecture Optimization: Escaping Local Optimum with Signed Neural Splitting

23 March 2020
Lemeng Wu, Mao Ye, Qi Lei, Jason D. Lee, Qiang Liu
arXiv: 2003.10392

Papers citing "Steepest Descent Neural Architecture Optimization: Escaping Local Optimum with Signed Neural Splitting"

14 citing papers shown
Flat Channels to Infinity in Neural Loss Landscapes
Flavio Martinelli, Alexander Van Meegen, Berfin Simsek, W. Gerstner, Johanni Brea
17 Jun 2025
Learning Morphisms with Gauss-Newton Approximation for Growing Networks
Neal Lawton, Aram Galstyan, Greg Ver Steeg
07 Nov 2024
Unified Gradient-Based Machine Unlearning with Remain Geometry Enhancement
Zhehao Huang, Xinwen Cheng, Jinghao Zheng, Haoran Wang, Zhengbao He, Tao Li, Xiaolin Huang
Neural Information Processing Systems (NeurIPS), 2024
29 Sep 2024
Beyond Uniform Scaling: Exploring Depth Heterogeneity in Neural Architectures
Akash Guna R.T, Arnav Chavan, Deepak Gupta
19 Feb 2024
Preparing Lessons for Progressive Training on Language Models
Yu Pan, Ye Yuan, Yichun Yin, Jiaxin Shi, Zenglin Xu, Ming Zhang, Lifeng Shang, Xin Jiang, Qun Liu
AAAI Conference on Artificial Intelligence (AAAI), 2024
17 Jan 2024
When To Grow? A Fitting Risk-Aware Policy for Layer Growing in Deep Neural Networks
Haihang Wu, Wei Wang, T. Malepathirana, Damith A. Senanayake, D. Oetomo, Saman K. Halgamuge
06 Jan 2024
Split-Ensemble: Efficient OOD-aware Ensemble via Task and Model Splitting
Anthony Chen, Huanrui Yang, Yulu Gan, Denis A. Gudovskiy, Zhen Dong, Haofan Wang, Tomoyuki Okuno, Yohei Nakata, Kurt Keutzer, Shanghang Zhang
International Conference on Machine Learning (ICML), 2023
14 Dec 2023
Reusing Pretrained Models by Multi-linear Operators for Efficient Training
Yu Pan, Ye Yuan, Yichun Yin, Zenglin Xu, Lifeng Shang, Xin Jiang, Qun Liu
16 Oct 2023
Accelerated Training via Incrementally Growing Neural Networks using Variance Transfer and Learning Rate Adaptation
Xin Yuan, Pedro H. P. Savarese, Michael Maire
Neural Information Processing Systems (NeurIPS), 2023
22 Jun 2023
Sparsity Winning Twice: Better Robust Generalization from More Efficient Training
Tianlong Chen, Zhenyu Zhang, Pengju Wang, Santosh Balachandra, Haoyu Ma, Zehao Wang, Zinan Lin
International Conference on Learning Representations (ICLR), 2022
20 Feb 2022
GradMax: Growing Neural Networks using Gradient Information
Utku Evci, B. V. Merrienboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa
International Conference on Learning Representations (ICLR), 2022
13 Jan 2022
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen, Yichun Yin, Lifeng Shang, Xin Jiang, Yujia Qin, Fengyu Wang, Zhi Wang, Xiao Chen, Zhiyuan Liu, Qun Liu
14 Oct 2021
Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks
Lemeng Wu, Bo Liu, Peter Stone, Qiang Liu
Neural Information Processing Systems (NeurIPS), 2021
17 Feb 2021
Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough
Mao Ye, Lemeng Wu, Qiang Liu
Neural Information Processing Systems (NeurIPS), 2020
29 Oct 2020