ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Qualitatively characterizing neural network optimization problems (arXiv:1412.6544)
Ian Goodfellow, Oriol Vinyals, Andrew M. Saxe
19 December 2014 [ODL]

Papers citing "Qualitatively characterizing neural network optimization problems" (50 / 111 papers shown)
1. Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling. Gregory W. Benton, Wesley J. Maddox, Sanae Lotfi, A. Wilson. 25 Feb 2021. [UQCV]
2. Visualization of Nonlinear Programming for Robot Motion Planning. David Hägele, Moataz Abdelaal, Ozgur S. Oguz, Marc Toussaint, Daniel Weiskopf. 28 Jan 2021.
3. Combating Mode Collapse in GAN training: An Empirical Analysis using Hessian Eigenvalues. Ricard Durall, Avraam Chatzimichailidis, P. Labus, J. Keuper. 17 Dec 2020. [GAN]
4. PEP: Parameter Ensembling by Perturbation. Alireza Mehrtash, Purang Abolmaesumi, Polina Golland, Tina Kapur, Demian Wassermann, W. Wells. 24 Oct 2020.
5. Softmax Deep Double Deterministic Policy Gradients. Ling Pan, Qingpeng Cai, Longbo Huang. 19 Oct 2020.
6. Gradient Flow in Sparse Neural Networks and How Lottery Tickets Win. Utku Evci, Yani Andrew Ioannou, Cem Keskin, Yann N. Dauphin. 07 Oct 2020.
7. A Comparative Study of Deep Learning Loss Functions for Multi-Label Remote Sensing Image Classification. Hichame Yessou, Gencer Sumbul, Begüm Demir. 29 Sep 2020.
8. Pruning artificial neural networks: a way to find well-generalizing, high-entropy sharp minima. Enzo Tartaglione, Andrea Bragagnolo, Marco Grangetto. 30 Apr 2020.
9. Symmetry & critical points for a model shallow neural network. Yossi Arjevani, M. Field. 23 Mar 2020.
10. Critical Point-Finding Methods Reveal Gradient-Flat Regions of Deep Network Losses. Charles G. Frye, James B. Simon, Neha S. Wadia, A. Ligeralde, M. DeWeese, K. Bouchard. 23 Mar 2020. [ODL]
11. On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping. Sanghyun Hong, Varun Chandrasekaran, Yigitcan Kaya, Tudor Dumitras, Nicolas Papernot. 26 Feb 2020. [AAML]
12. SNIFF: Reverse Engineering of Neural Networks with Fault Attacks. J. Breier, Dirmanto Jap, Xiaolu Hou, S. Bhasin, Yang Liu. 23 Feb 2020.
13. The Break-Even Point on Optimization Trajectories of Deep Neural Networks. Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort, Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof J. Geras. 21 Feb 2020.
14. Gradient Surgery for Multi-Task Learning. Tianhe Yu, Saurabh Kumar, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn. 19 Jan 2020.
15. Optimization for deep learning: theory and algorithms. Ruoyu Sun. 19 Dec 2019. [ODL]
16. Deep Ensembles: A Loss Landscape Perspective. Stanislav Fort, Huiyi Hu, Balaji Lakshminarayanan. 05 Dec 2019. [OOD, UQCV]
17. GradVis: Visualization and Second Order Analysis of Optimization Surfaces during the Training of Deep Neural Networks. Avraam Chatzimichailidis, Franz-Josef Pfreundt, N. Gauger, J. Keuper. 26 Sep 2019.
18. Visualizing Movement Control Optimization Landscapes. Perttu Hämäläinen, Juuso Toikka, Amin Babadi, Karen Liu. 17 Sep 2019.
19. Visualizing and Understanding the Effectiveness of BERT. Y. Hao, Li Dong, Furu Wei, Ke Xu. 15 Aug 2019.
20. Visualizing the PHATE of Neural Networks. Scott A. Gigante, Adam S. Charles, Smita Krishnaswamy, Gal Mishne. 07 Aug 2019.
21. Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape. Johanni Brea, Berfin Simsek, Bernd Illing, W. Gerstner. 05 Jul 2019.
22. Decentralized Bayesian Learning over Graphs. Anusha Lalitha, Xinghan Wang, O. Kilinc, Y. Lu, T. Javidi, F. Koushanfar. 24 May 2019. [FedML]
23. Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization. Hesham Mostafa, Xin Wang. 15 Feb 2019.
24. Asymmetric Valleys: Beyond Sharp and Flat Local Minima. Haowei He, Gao Huang, Yang Yuan. 02 Feb 2019. [ODL, MLT]
25. On the Convergence Rate of Training Recurrent Neural Networks. Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song. 29 Oct 2018.
26. Collaborative Deep Learning Across Multiple Data Centers. Kele Xu, Haibo Mi, Dawei Feng, Huaimin Wang, Chuan Chen, Zibin Zheng, Xu Lan. 16 Oct 2018. [FedML]
27. Distributed learning of deep neural network over multiple agents. O. Gupta, Ramesh Raskar. 14 Oct 2018. [FedML, OOD]
28. Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes. Roman Novak, Lechao Xiao, Jaehoon Lee, Yasaman Bahri, Greg Yang, Jiri Hron, Daniel A. Abolafia, Jeffrey Pennington, Jascha Narain Sohl-Dickstein. 11 Oct 2018. [UQCV, BDL]
29. Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning. Charles H. Martin, Michael W. Mahoney. 02 Oct 2018. [AI4CE]
30. Interpreting Adversarial Robustness: A View from Decision Surface in Input Space. Fuxun Yu, Chenchen Liu, Yanzhi Wang, Liang Zhao, Xiang Chen. 29 Sep 2018. [AAML, OOD]
31. Benchmarking five global optimization approaches for nano-optical shape optimization and parameter reconstruction. Philipp-Immanuel Schneider, Xavier Garcia Santiago, V. Soltwisch, M. Hammerschmidt, Sven Burger, C. Rockstuhl. 18 Sep 2018.
32. Don't Use Large Mini-Batches, Use Local SGD. Tao R. Lin, Sebastian U. Stich, Kumar Kshitij Patel, Martin Jaggi. 22 Aug 2018.
33. Troubling Trends in Machine Learning Scholarship. Zachary Chase Lipton, Jacob Steinhardt. 09 Jul 2018.
34. PCA of high dimensional random walks with comparison to neural network training. J. Antognini, Jascha Narain Sohl-Dickstein. 22 Jun 2018. [OOD]
35. Using transfer learning to detect galaxy mergers. Sandro Ackermann, Kevin Schawinski, Ce Zhang, Anna K. Weigel, M. D. Turp. 25 May 2018. [3DPC]
36. SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning. W. Wen, Yandan Wang, Feng Yan, Cong Xu, Chunpeng Wu, Yiran Chen, H. Li. 21 May 2018.
37. Measuring the Intrinsic Dimension of Objective Landscapes. Chunyuan Li, Heerad Farkhoor, Rosanne Liu, J. Yosinski. 24 Apr 2018.
38. The Loss Surface of XOR Artificial Neural Networks. D. Mehta, Xiaojun Zhao, Edgar A. Bernal, D. Wales. 06 Apr 2018.
39. Averaging Weights Leads to Wider Optima and Better Generalization. Pavel Izmailov, Dmitrii Podoprikhin, T. Garipov, Dmitry Vetrov, A. Wilson. 14 Mar 2018. [FedML, MoMe]
40. A Walk with SGD. Chen Xing, Devansh Arpit, Christos Tsirigotis, Yoshua Bengio. 24 Feb 2018.
41. signSGD: Compressed Optimisation for Non-Convex Problems. Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Anima Anandkumar. 13 Feb 2018. [FedML, ODL]
42. Visualizing the Loss Landscape of Neural Nets. Hao Li, Zheng Xu, Gavin Taylor, Christoph Studer, Tom Goldstein. 28 Dec 2017.
43. Neon2: Finding Local Minima via First-Order Oracles. Zeyuan Allen-Zhu, Yuanzhi Li. 17 Nov 2017.
44. Three Factors Influencing Minima in SGD. Stanislaw Jastrzebski, Zachary Kenton, Devansh Arpit, Nicolas Ballas, Asja Fischer, Yoshua Bengio, Amos Storkey. 13 Nov 2017.
45. Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior. Charles H. Martin, Michael W. Mahoney. 26 Oct 2017. [AI4CE]
46. High-dimensional dynamics of generalization error in neural networks. Madhu S. Advani, Andrew M. Saxe. 10 Oct 2017. [AI4CE]
47. Natasha 2: Faster Non-Convex Optimization Than SGD. Zeyuan Allen-Zhu. 29 Aug 2017. [ODL]
48. Are Saddles Good Enough for Deep Learning? Adepu Ravi Sankar, V. Balasubramanian. 07 Jun 2017.
49. The loss surface of deep and wide neural networks. Quynh N. Nguyen, Matthias Hein. 26 Apr 2017. [ODL]
50. Snapshot Ensembles: Train 1, get M for free. Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, J. Hopcroft, Kilian Q. Weinberger. 01 Apr 2017. [OOD, FedML, UQCV]