ResearchTrend.AI
© 2026 ResearchTrend.AI, All rights reserved.

Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs

Neural Information Processing Systems (NeurIPS), 2018
27 February 2018
T. Garipov
Pavel Izmailov
Dmitrii Podoprikhin
Dmitry Vetrov
A. Wilson
    UQCV

Papers citing "Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs"

48 / 548 papers shown
A study of local optima for learning feature interactions using neural networks
IEEE International Joint Conference on Neural Networks (IJCNN), 2020
Yangzi Guo
Adrian Barbu
239
1
0
11 Feb 2020
SQWA: Stochastic Quantized Weight Averaging for Improving the Generalization Capability of Low-Precision Deep Neural Networks
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020
Sungho Shin
Yoonho Boo
Wonyong Sung
MQ
141
4
0
02 Feb 2020
Parameter Space Factorization for Zero-Shot Learning across Tasks and Languages
Transactions of the Association for Computational Linguistics (TACL), 2020
Edoardo Ponti
Ivan Vulić
Robert Bamler
Marinela Parović
Roi Reichart
Anna Korhonen
BDL
331
30
0
30 Jan 2020
The Case for Bayesian Deep Learning
A. Wilson
UQCV, BDL, OOD
288
121
0
29 Jan 2020
On Last-Layer Algorithms for Classification: Decoupling Representation from Uncertainty Estimation
N. Brosse
C. Riquelme
Alice Martin
Sylvain Gelly
Eric Moulines
BDL, OOD, UQCV
169
36
0
22 Jan 2020
Stochastic Weight Averaging in Parallel: Large-Batch Training that Generalizes Well
International Conference on Learning Representations (ICLR), 2020
Vipul Gupta
S. Serrano
D. DeCoste
MoMe
290
73
0
07 Jan 2020
Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks
International Conference on Machine Learning (ICML), 2019
Aleksandr Shevchenko
Marco Mondelli
433
41
0
20 Dec 2019
Optimization for deep learning: theory and algorithms
Tian Ding
ODL
343
178
0
19 Dec 2019
Linear Mode Connectivity and the Lottery Ticket Hypothesis
International Conference on Machine Learning (ICML), 2019
Jonathan Frankle
Gintare Karolina Dziugaite
Daniel M. Roy
Michael Carbin
MoMe
799
706
0
11 Dec 2019
Deep Ensembles: A Loss Landscape Perspective
Stanislav Fort
Huiyi Hu
Balaji Lakshminarayanan
OOD, UQCV
445
700
0
05 Dec 2019
Semi-Supervised Learning for Text Classification by Layer Partitioning
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2019
Alexander Hanbo Li
A. Sethy
139
12
0
26 Nov 2019
Rigging the Lottery: Making All Tickets Winners
International Conference on Machine Learning (ICML), 2019
Utku Evci
Trevor Gale
Jacob Menick
Pablo Samuel Castro
Erich Elsen
539
688
0
25 Nov 2019
Sub-Optimal Local Minima Exist for Neural Networks with Almost All Non-Linear Activations
Tian Ding
Dawei Li
343
14
0
04 Nov 2019
Loss Patterns of Neural Networks
Ivan Skorokhodov
Andrey Kravchenko
3DPC
174
18
0
09 Oct 2019
Pure and Spurious Critical Points: a Geometric Study of Linear Networks
International Conference on Learning Representations (ICLR), 2019
Matthew Trager
Kathlén Kohn
Joan Bruna
185
35
0
03 Oct 2019
Generalization Bounds for Convolutional Neural Networks
Shan Lin
Jingwei Zhang
MLT
139
36
0
03 Oct 2019
How noise affects the Hessian spectrum in overparameterized neural networks
Ming-Bo Wei
D. Schwab
259
32
0
01 Oct 2019
Lookahead Optimizer: k steps forward, 1 step back
Neural Information Processing Systems (NeurIPS), 2019
Michael Ruogu Zhang
James Lucas
Geoffrey E. Hinton
Jimmy Ba
ODL
491
816
0
19 Jul 2019
Subspace Inference for Bayesian Deep Learning
Conference on Uncertainty in Artificial Intelligence (UAI), 2019
Pavel Izmailov
Wesley J. Maddox
Polina Kirichenko
T. Garipov
Dmitry Vetrov
A. Wilson
UQCV, BDL
275
155
0
17 Jul 2019
Towards Understanding Generalization in Gradient-Based Meta-Learning
Simon Guiroy
Vikas Verma
C. Pal
172
22
0
16 Jul 2019
Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
Johanni Brea
Berfin Simsek
Bernd Illing
W. Gerstner
298
65
0
05 Jul 2019
The Difficulty of Training Sparse Neural Networks
Utku Evci
Fabian Pedregosa
Aidan Gomez
Erich Elsen
336
108
0
25 Jun 2019
Homogeneous Vector Capsules Enable Adaptive Gradient Descent in Convolutional Neural Networks
IEEE Access, 2019
Adam Byerly
T. Kalganova
244
14
0
20 Jun 2019
Finding the Needle in the Haystack with Convolutions: on the benefits of architectural bias
Neural Information Processing Systems (NeurIPS), 2019
Stéphane d'Ascoli
Levent Sagun
Joan Bruna
Giulio Biroli
183
37
0
16 Jun 2019
Explaining Landscape Connectivity of Low-cost Solutions for Multilayer Nets
Neural Information Processing Systems (NeurIPS), 2019
Rohith Kuditipudi
Xiang Wang
Holden Lee
Yi Zhang
Zhiyuan Li
Wei Hu
Sanjeev Arora
Rong Ge
FAtt
424
101
0
14 Jun 2019
Large Scale Structure of Neural Network Loss Landscapes
Neural Information Processing Systems (NeurIPS), 2019
Stanislav Fort
Stanislaw Jastrzebski
230
94
0
11 Jun 2019
A Direct Approach to Robust Deep Learning Using Adversarial Networks
International Conference on Learning Representations (ICLR), 2019
Huaxia Wang
Chun-Nam Yu
GAN, AAML, OOD
167
81
0
23 May 2019
Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints
International Conference on Learning Representations (ICLR), 2019
Mengtian Li
Ersin Yumer
Deva Ramanan
258
54
0
12 May 2019
A Generative Model for Sampling High-Performance and Diverse Weights for Neural Networks
Lior Deutsch
Erik Nijkamp
Yu Yang
118
16
0
07 May 2019
Ensemble Distribution Distillation
International Conference on Learning Representations (ICLR), 2019
A. Malinin
Bruno Mlodozeniec
Mark Gales
UQCV
527
263
0
30 Apr 2019
Uniform convergence may be unable to explain generalization in deep learning
Neural Information Processing Systems (NeurIPS), 2019
Vaishnavh Nagarajan
J. Zico Kolter
MoMe, AI4CE
436
336
0
13 Feb 2019
Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
International Conference on Learning Representations (ICLR), 2019
Ruqi Zhang
Chunyuan Li
Jianyi Zhang
Changyou Chen
A. Wilson
BDL
289
291
0
11 Feb 2019
A Simple Baseline for Bayesian Uncertainty in Deep Learning
Wesley J. Maddox
T. Garipov
Pavel Izmailov
Dmitry Vetrov
A. Wilson
BDL, UQCV
753
913
0
07 Feb 2019
Asymmetric Valleys: Beyond Sharp and Flat Local Minima
Neural Information Processing Systems (NeurIPS), 2019
Haowei He
Gao Huang
Yang Yuan
ODL, MLT
271
158
0
02 Feb 2019
Loss Landscapes of Regularized Linear Autoencoders
D. Kunin
Jonathan M. Bloom
A. Goeva
C. Seed
368
98
0
23 Jan 2019
On Connected Sublevel Sets in Deep Learning
Quynh N. Nguyen
332
106
0
22 Jan 2019
Enhancing Discrete Choice Models with Representation Learning
Brian Sifringer
Virginie Lurkin
Alexandre Alahi
66
12
0
23 Dec 2018
Projected BNNs: Avoiding weight-space pathologies by learning latent representations of neural network weights
Melanie F. Pradier
Weiwei Pan
Jiayu Yao
S. Ghosh
Finale Doshi-velez
UQCV, BDL
228
10
0
16 Nov 2018
A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
Akhilesh Deepak Gotmare
N. Keskar
Caiming Xiong
R. Socher
ODL
251
304
0
29 Oct 2018
Good Initializations of Variational Bayes for Deep Models
Simone Rossi
Pietro Michiardi
Maurizio Filippone
BDL
332
23
0
18 Oct 2018
MotherNets: Rapid Deep Ensemble Learning
Abdul Wasay
Brian Hentschel
Yuze Liao
Sanyuan Chen
Stratos Idreos
174
39
0
12 Sep 2018
Make (Nearly) Every Neural Network Better: Generating Neural Network Ensembles by Weight Parameter Resampling
Jiayi Liu
S. Tripathi
Unmesh Kurup
Mohak Shah
UQCV
104
4
0
02 Jul 2018
Using Mode Connectivity for Loss Landscape Analysis
Akhilesh Deepak Gotmare
N. Keskar
Caiming Xiong
R. Socher
174
29
0
18 Jun 2018
The global optimum of shallow neural network is attained by ridgelet transform
Sho Sonoda
Isao Ishikawa
Masahiro Ikeda
Kei Hagihara
Y. Sawano
Takuo Matsubara
Noboru Murata
169
1
0
19 May 2018
Averaging Weights Leads to Wider Optima and Better Generalization
Conference on Uncertainty in Artificial Intelligence (UAI), 2018
Pavel Izmailov
Dmitrii Podoprikhin
T. Garipov
Dmitry Vetrov
A. Wilson
FedML, MoMe
649
1,890
0
14 Mar 2018
Variance Networks: When Expectation Does Not Meet Your Expectations
International Conference on Learning Representations (ICLR), 2018
Kirill Neklyudov
Dmitry Molchanov
Arsenii Ashukha
Dmitry Vetrov
UQCV
366
24
0
10 Mar 2018
Essentially No Barriers in Neural Network Energy Landscape
International Conference on Machine Learning (ICML), 2018
Felix Dräxler
K. Veschgini
M. Salmhofer
Fred Hamprecht
MoMe
579
487
0
02 Mar 2018
Generating Neural Networks with Neural Networks
Lior Deutsch
311
22
0
06 Jan 2018
Page 11 of 11