Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs

Neural Information Processing Systems (NeurIPS), 2018 · arXiv:1802.10026 · 27 February 2018
Timur Garipov, Pavel Izmailov, Dmitrii Podoprikhin, Dmitry Vetrov, Andrew Gordon Wilson
Community: UQCV

Papers citing "Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs"

Showing 50 of 547 citing papers:
Walking on the Fiber: A Simple Geometric Approximation for Bayesian Neural Networks
Alfredo Reichlin, Miguel Vasco, Danica Kragic (BDL, UQCV) · 01 Dec 2025

Merge and Bound: Direct Manipulations on Weights for Class Incremental Learning
Taehoon Kim, Donghwan Jang, Bohyung Han (CLL, MoMe) · 26 Nov 2025

A Systematic Study of Model Merging Techniques in Large Language Models
Oğuz Kağan Hitit, Leander Girrbach, Zeynep Akata (MoMe) · 26 Nov 2025

Merging without Forgetting: Continual Fusion of Task-Specific Models via Optimal Transport
Z. Pan, Zhikang Chen, Ding Li, Min Zhang, Sen Cui, ..., Yi Yang, Deheng Ye, Yu Zhang, T. Zhu, Tianling Ren (MoMe, CLL, FedML) · 24 Nov 2025

Subtract the Corruption: Training-Data-Free Corrective Machine Unlearning using Task Arithmetic
Mostafa Mozafari, F. Wani, Maria Sofia Bucarelli, Fabrizio Silvestri · 24 Nov 2025

Shadows in the Code: Exploring the Risks and Defenses of LLM-based Multi-Agent Software Development Systems
Xiaoqing Wang, Keman Huang, Bin Liang, Hongyu Li, Xiaoyong Du (LLMAG, AAML) · 23 Nov 2025

Escaping Optimization Stagnation: Taking Steps Beyond Task Arithmetic via Difference Vectors
Jinping Wang, Zhiqiang Gao, Dinggen Zhang, Zhiwu Xie (MoMe) · 22 Nov 2025

Symmetry-Aware Graph Metanetwork Autoencoders: Model Merging through Parameter Canonicalization
Odysseas Boufalis, Jorge Carrasco-Pollo, Joshua Rosenthal, Eduardo Terres-Caballero, Alejandro García-Castellanos · 16 Nov 2025

Uncertainty-Guided Selective Adaptation Enables Cross-Platform Predictive Fluorescence Microscopy
Kai-Wen K. Yang, Andrew Bai, Alexandra Bermudez, Yunqi Hong, Zoe Latham, Iris Sloan, Michael Liu, Vishrut Goyal, Cho-Jui Hsieh, Neil Y. C. Lin (OOD, MedIm) · 15 Nov 2025

Adaptive Stepsizing for Stochastic Gradient Langevin Dynamics in Bayesian Neural Networks
Rajit Rajpal, Benedict Leimkuhler, Yuanhao Jiang (BDL) · 11 Nov 2025

Model Merging Improves Zero-Shot Generalization in Bioacoustic Foundation Models
Davide Marincione, Donato Crisostomi, Roberto Dessi, Emanuele Rodolà, Emanuele Rossi (MoMe, AI4CE, VLM) · 07 Nov 2025

Linear Mode Connectivity under Data Shifts for Deep Ensembles of Image Classifiers
C. Hepburn, T. Zielke, A.P. Raulf · 06 Nov 2025

Sharp Minima Can Generalize: A Loss Landscape Perspective On Data
Raymond Fan, Bryce Sandlund, Lin Myat Ko · 06 Nov 2025

Shared Parameter Subspaces and Cross-Task Linearity in Emergently Misaligned Behavior
Daniel Aarao Reis Arturi, Eric Zhang, Andrew Ansah, Kevin Zhu, Ashwinee Panda, Aishwarya Balwani · 03 Nov 2025

Keys in the Weights: Transformer Authentication Using Model-Bound Latent Representations
Ayşe S. Okatan, Mustafa İlhan Akbaş, Laxima Niure Kandel, Berker Peköz · 02 Nov 2025

WeaveRec: An LLM-Based Cross-Domain Sequential Recommendation Framework with Model Merging
Min Hou, Xin Liu, Le Wu, Chenyi He, Hao Liu, Z. Li, Xin Li, Si Wei (MoMe) · 30 Oct 2025

Parameter Averaging in Link Prediction
Rupesh Sapkota, Caglar Demir, Arnab Sharma, A. Ngomo (MoMe, FedML) · 29 Oct 2025

A Unified Perspective on Optimization in Machine Learning and Neuroscience: From Gradient Descent to Neural Adaptation
Jesus Garcia Fernandez, Nasir Ahmad, Marcel van Gerven (AI4CE) · 21 Oct 2025

Closing the Curvature Gap: Full Transformer Hessians and Their Implications for Scaling Laws
Egor Petrov, Nikita Kiselev, Vladislav Meshkov, Andrey Grabovoy · 19 Oct 2025

MIN-Merging: Merge the Important Neurons for Model Merging
Yunfei Liang (MoMe) · 18 Oct 2025

REAP the Experts: Why Pruning Prevails for One-Shot MoE compression
Mike Lasby, Ivan Lazarevich, Nish Sinnadurai, Sean Lie, Yani Andrew Ioannou, Vithursan Thangarasa · 15 Oct 2025

Understanding the Effects of Domain Finetuning on LLMs
Eshaan Tanwar, Deepak Nathani, William Yang Wang, Tanmoy Chakraborty · 10 Oct 2025

Do We Really Need Permutations? Impact of Width Expansion on Linear Mode Connectivity
Akira Ito, Masanori Yamada, Daiki Chijiwa, Atsutoshi Kumagai (MoMe) · 09 Oct 2025

Gradient-Sign Masking for Task Vector Transport Across Pre-Trained Models
Filippo Rinaldi, Aniello Panariello, Giacomo Salici, Fengyuan Liu, Marco Ciccone, Angelo Porrello, Simone Calderara · 07 Oct 2025

Improving Clinical Dataset Condensation with Mode Connectivity-based Trajectory Surrogates
Pafue Christy Nganjimi, A. Soltan, Danielle Belgrave, Lei A. Clifton, David Clifton, A. Thakur (DD, AI4CE) · 07 Oct 2025

How does the optimizer implicitly bias the model merging loss landscape?
Chenxiang Zhang, Alexander Theus, Damien Teney, Antonio Orvieto, Jun Pang, S. Mauw (MoMe) · 06 Oct 2025

Categorical Invariants of Learning Dynamics
Abdulrahman Tamim (OOD) · 05 Oct 2025

Non-Linear Trajectory Modeling for Multi-Step Gradient Inversion Attacks in Federated Learning
Li Xia, Zheng Liu, Sili Huang, Wei Tang, Xuan Liu (AAML) · 26 Sep 2025

Closing the Oracle Gap: Increment Vector Transformation for Class Incremental Learning
Zihuan Qiu, Yi Xu, Fanman Meng, Runtong Zhang, Linfeng Xu, Qingbo Wu, Hongliang Li (CLL, VLM) · 26 Sep 2025

The Thinking Spectrum: An Empirical Study of Tunable Reasoning in LLMs through Model Merging
Xiaochong Lan, Yu Zheng, Shiteng Cao, Yong Li (MoMe, LRM) · 26 Sep 2025

Sharpness-Aware Minimization Can Hallucinate Minimizers
Chanwoong Park, Uijeong Jang, Ernest K. Ryu, Insoon Yang · 26 Sep 2025

Pre-training under infinite compute
Konwoo Kim, Suhas Kotha, Abigail Z. Jacobs, Tatsunori Hashimoto · 18 Sep 2025

Exploring the Relationship between Brain Hemisphere States and Frequency Bands through Deep Learning Optimization Techniques
Robiul Islam, Dmitry I. Ignatov, Karl Kaberg, Roman Nabatchikov · 17 Sep 2025

Harnessing Optimization Dynamics for Curvature-Informed Model Merging
Pouria Mahdavinia, Hamed Mahdavi, Niloofar Mireshghallah, M. Mahdavi (MoMe) · 14 Sep 2025

Characterizing Fitness Landscape Structures in Prompt Engineering
Arend Hintze · 04 Sep 2025

Distribution Shift Aware Neural Tabular Learning
Wangyang Ying, Nanxu Gong, Dongjie Wang, Xinyuan Wang, Arun Vignesh Malarkkan, Vivek Gupta, Chandan K. Reddy, Yanjie Fu (OOD) · 27 Aug 2025

Learning from Oblivion: Predicting Knowledge Overflowed Weights via Retrodiction of Forgetting
Jinhyeok Jang, Jaehong Kim, Jung Uk Kim · 07 Aug 2025

Forgetting of task-specific knowledge in model merging-based continual learning
Timm Hess, Gido M. van de Ven, Tinne Tuytelaars (CLL, FedML, MoMe, KELM, VLM) · 31 Jul 2025

Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown
Emile Anand, Sarah Liaw · 21 Jul 2025

On the Surprising Effectiveness of a Single Global Merging in Decentralized Learning
Tongtian Zhu, Tianyu Zhang, Mingze Wang, Zhanpeng Zhou, Can Wang (FedML, MoMe) · 09 Jul 2025

Generalized Linear Mode Connectivity for Transformers
Alexander Theus, Alessandro Cabodi, Sotiris Anagnostidis, Antonio Orvieto, Sidak Pal Singh, Valentina Boeva · 28 Jun 2025

Subspace-Boosted Model Merging
Ronald Skorobogat, Karsten Roth, Mariana-Iuliana Georgescu (MoMe) · 19 Jun 2025

Flat Channels to Infinity in Neural Loss Landscapes
Flavio Martinelli, Alexander Van Meegen, Berfin Simsek, W. Gerstner, Johanni Brea · 17 Jun 2025

The Butterfly Effect: Neural Network Training Trajectories Are Highly Sensitive to Initial Conditions
Devin Kwok, Gül Sena Altıntaş, Colin Raffel, David Rolnick · 16 Jun 2025

Symmetry in Neural Network Parameter Spaces
Bo Zhao, Robin Walters, Rose Yu · 16 Jun 2025

Circumventing Backdoor Space via Weight Symmetry
Jie Peng, Hongwei Yang, Jing Zhao, Hengji Dong, Hui He, Weizhe Zhang, Haoyu He (AAML) · 09 Jun 2025

Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking
Yuatyong Chaichana, Thanapat Trachu, Peerat Limkonchotiwat, Konpat Preechakul, Tirasan Khandhawit, Ekapol Chuangsuwanich (MoMe) · 29 May 2025

Understanding Mode Connectivity via Parameter Space Symmetry
B. Zhao, Nima Dehmamy, Robin Walters, Rose Yu · 29 May 2025

SGD as Free Energy Minimization: A Thermodynamic View on Neural Network Training
Ildus Sadrtdinov, Ivan Klimov, E. Lobacheva, Dmitry Vetrov · 29 May 2025

Benignity of loss landscape with weight decay requires both large overparametrization and initialization
Etienne Boursier, Matthew Bowditch, Matthias Englert, R. Lazic · 28 May 2025