Optimizing Mode Connectivity via Neuron Alignment

Neural Information Processing Systems (NeurIPS), 2020
5 September 2020
N. Joseph Tatro
Pin-Yu Chen
Payel Das
Igor Melnyk
P. Sattigeri
Rongjie Lai
    MoMe

Papers citing "Optimizing Mode Connectivity via Neuron Alignment"

50 / 75 papers shown
A Systematic Study of In-the-Wild Model Merging for Large Language Models
Oğuz Kağan Hitit
Leander Girrbach
Zeynep Akata
MoMe
369
3
0
26 Nov 2025
Can MLLMs Absorb Math Reasoning Abilities from LLMs as Free Lunch?
Yijie Hu
Zihao Zhou
Kaizhu Huang
Xiaowei Huang
Qiufeng Wang
LRM
156
3
0
16 Oct 2025
Rethinking Layer-wise Model Merging through Chain of Merges
Pietro Buzzega
Riccardo Salami
Angelo Porrello
Simone Calderara
MoMe, AI4CE
234
1
0
29 Aug 2025
Generalized Linear Mode Connectivity for Transformers
Alexander Theus
Alessandro Cabodi
Sotiris Anagnostidis
Antonio Orvieto
Sidak Pal Singh
Valentina Boeva
444
2
0
28 Jun 2025
Circumventing Backdoor Space via Weight Symmetry
Jie Peng
Hongwei Yang
Jing Zhao
Hengji Dong
Hui He
Weizhe Zhang
Haoyu He
AAML
302
1
0
09 Jun 2025
Decom-Renorm-Merge: Model Merging on the Right Space Improves Multitasking
Yuatyong Chaichana
Thanapat Trachu
Peerat Limkonchotiwat
Konpat Preechakul
Tirasan Khandhawit
Ekapol Chuangsuwanich
MoMe
666
1
0
29 May 2025
Understanding Mode Connectivity via Parameter Space Symmetry
B. Zhao
Nima Dehmamy
Robin Walters
Rose Yu
680
11
0
29 May 2025
Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
Mohammed Adnan
Rohan Jain
Ekansh Sharma
Rahul Krishnan
Yani Andrew Ioannou
377
1
0
08 May 2025
Aggregation on Learnable Manifolds for Asynchronous Federated Optimization
Archie Licudi
A. Thakur
Soheila Molaei
Danielle Belgrave
David Clifton
FedML
400
0
0
18 Mar 2025
From Task-Specific Models to Unified Systems: A Review of Model Merging Approaches
Wei Ruan
Tianze Yang
Yimiao Zhou
Tianming Liu
Jin Lu
MoMe
462
10
0
13 Mar 2025
Paths and Ambient Spaces in Neural Loss Landscapes
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
Daniel Dold
Julius Kobialka
Nicolai Palm
Emanuel Sommer
David Rügamer
Oliver Durr
AI4CE
512
3
0
05 Mar 2025
Low-Rank and Sparse Model Merging for Multi-Lingual Speech Recognition and Translation
Qiuming Zhao
Guangzhi Sun
Chao Zhang
MoMe, VLM
1.1K
5
0
24 Feb 2025
Sens-Merging: Sensitivity-Guided Parameter Balancing for Merging Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Shuqi Liu
Han Wu
Bowei He
Xiongwei Han
Mingxuan Yuan
Linqi Song
MoMe
420
8
0
20 Feb 2025
Unveiling Mode Connectivity in Graph Neural Networks
Bingheng Li
Z. Chen
Haoyu Han
Shenglai Zeng
J. Liu
Shucheng Zhou
294
3
0
18 Feb 2025
Beyond the Permutation Symmetry of Transformers: The Role of Rotation for Model Fusion
Binchi Zhang
Zaiyi Zheng
Zhengzhang Chen
Wenlin Yao
769
8
0
01 Feb 2025
Merging Feed-Forward Sublayers for Compressed Transformers
Neha Verma
Kenton W. Murray
Kevin Duh
AI4CE
418
0
0
10 Jan 2025
Training-free Heterogeneous Model Merging
Zhengqi Xu
Han Zheng
Jie Song
Li Sun
Weilong Dai
MoMe
548
3
0
03 Jan 2025
Non-Uniform Parameter-Wise Model Merging
BigData Congress [Services Society] (BSS), 2024
Albert Manuel Orozco Camacho
Stefan Horoi
Guy Wolf
Eugene Belilovsky
MoMe, FedML
458
1
0
20 Dec 2024
MoD: A Distribution-Based Approach for Merging Large Language Models
Quy-Anh Dang
Chris Ngo
MoMe, VLM
313
0
0
01 Nov 2024
Efficient and Effective Weight-Ensembling Mixture of Experts for Multi-Task Model Merging
Li Shen
Anke Tang
Enneng Yang
G. Guo
Yong Luo
Lefei Zhang
Xiaochun Cao
Di Lin
Dacheng Tao
MoMe
232
19
0
29 Oct 2024
Deep Model Merging: The Sister of Neural Network Interpretability -- A Survey
A. Khan
Todd Nief
Nathaniel Hudson
Mansi Sakarvadia
Daniel Grzenda
Aswathy Ajith
Jordan Pettyjohn
Kyle Chard
Ian Foster
MoMe
292
1
0
16 Oct 2024
Exploring Model Kinship for Merging Large Language Models
Yedi Hu
Yunzhi Yao
Ningyu Zhang
Shumin Deng
MoMe
504
1
0
16 Oct 2024
Revisiting Multi-Permutation Equivariance through the Lens of Irreducible Representations
International Conference on Learning Representations (ICLR), 2024
Yonatan Sverdlov
Ido Springer
Nadav Dym
496
2
0
09 Oct 2024
What Matters for Model Merging at Scale?
Prateek Yadav
Tu Vu
Jonathan Lai
Alexandra Chronopoulou
Manaal Faruqui
Joey Tianyi Zhou
Tsendsuren Munkhdalai
MoMe
296
49
0
04 Oct 2024
Parameter Competition Balancing for Model Merging
Neural Information Processing Systems (NeurIPS), 2024
Guodong DU
Junlin Lee
Jing Li
Runhua Jiang
Yifei Guo
...
Hanting Liu
Sim Kuan Goh
Jing Li
Daojing He
Min Zhang
MoMe
277
58
0
03 Oct 2024
Foldable SuperNets: Scalable Merging of Transformers with Different Initializations and Tasks
Edan Kinderman
Itay Hubara
Haggai Maron
Daniel Soudry
MoMe
408
4
0
02 Oct 2024
Weight Scope Alignment: A Frustratingly Easy Method for Model Merging
European Conference on Artificial Intelligence (ECAI), 2024
Yichu Xu
Xin-Chun Li
Le Gan
De-Chuan Zhan
MoMe
364
3
0
22 Aug 2024
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models
Anke Tang
Li Shen
Yong Luo
Shuai Xie
Han Hu
Lefei Zhang
Di Lin
Dacheng Tao
MoMe
396
9
0
19 Aug 2024
Computer Audition: From Task-Specific Machine Learning to Foundation Models
Andreas Triantafyllopoulos
Iosif Tsangko
Alexander Gebhard
A. Mesaros
Maria Sandsten
B. Schuller
456
10
0
22 Jul 2024
Training-Free Model Merging for Multi-target Domain Adaptation
Wenyi Li
Huan-ang Gao
Mingju Gao
Beiwen Tian
Rong Zhi
Hao Zhao
MoMe
274
13
0
18 Jul 2024
Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis
Stefan Horoi
Albert Manuel Orozco Camacho
Eugene Belilovsky
Guy Wolf
FedML, MoMe
275
12
0
07 Jul 2024
Investigating the Pre-Training Dynamics of In-Context Learning: Task Recognition vs. Task Learning
Xiaolei Wang
Xinyu Tang
Wayne Xin Zhao
Ji-Rong Wen
306
6
0
20 Jun 2024
Towards Efficient Pareto Set Approximation via Mixture of Experts Based Model Fusion
Anke Tang
Li Shen
Yong Luo
Shiwei Liu
Han Hu
Di Lin
MoMe
240
13
0
14 Jun 2024
The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof
Derek Lim
Moe Putterman
Robin Walters
Haggai Maron
Stefanie Jegelka
588
19
0
30 May 2024
Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models
Sheng-Hsuan Peng
Pin-Yu Chen
Matthew Hull
Duen Horng Chau
359
55
0
27 May 2024
Visualizing, Rethinking, and Mining the Loss Landscape of Deep Neural Networks
Yichu Xu
Xin-Chun Li
Lan Li
De-Chuan Zhan
408
3
0
21 May 2024
Simultaneous linear connectivity of neural networks modulo permutation
Ekansh Sharma
Devin Kwok
Tom Denton
Daniel M. Roy
David Rolnick
Gintare Karolina Dziugaite
494
9
0
09 Apr 2024
Continual Learning with Weight Interpolation
Jędrzej Kozal
Jan Wasilewski
Bartosz Krawczyk
Michal Woźniak
CLL, MoMe
528
12
0
05 Apr 2024
Out-of-Distribution Detection via Deep Multi-Comprehension Ensemble
Chenhui Xu
Fuxun Yu
Zirui Xu
Nathan Inkawhich
Xiang Chen
OODD
308
14
0
24 Mar 2024
Arcee's MergeKit: A Toolkit for Merging Large Language Models
Charles Goddard
Shamane Siriwardhana
Malikeh Ehghaghi
Luke Meyers
Vladimir Karpukhin
Brian Benedict
Mark McQuade
Jacob Solawetz
MoMe, KELM
804
187
0
20 Mar 2024
Fisher Mask Nodes for Language Model Merging
International Conference on Language Resources and Evaluation (LREC), 2024
Thennal D K
Ganesh Nathan
Suchithra M S
MoMe, AI4CE
474
7
0
14 Mar 2024
Training-Free Pretrained Model Merging
Zhenxing Xu
Ke Yuan
Huiqiong Wang
Yong Wang
Weilong Dai
Mingli Song
MoMe
442
30
0
04 Mar 2024
Merging Text Transformer Models from Different Initializations
Neha Verma
Maha Elbayad
MoMe
405
13
0
01 Mar 2024
Training Neural Networks from Scratch with Parallel Low-Rank Adapters
Minyoung Huh
Brian Cheung
Jeremy Bernstein
Phillip Isola
Pulkit Agrawal
359
17
0
26 Feb 2024
Improving Model Fusion by Training-time Neuron Alignment with Fixed Neuron Anchors
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Zexi Li
Zhiqi Li
Jie Lin
Zhenyuan Zhang
Tao Lin
Chao Wu
464
5
0
02 Feb 2024
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Anke Tang
Li Shen
Yong Luo
Nan Yin
Lefei Zhang
Dacheng Tao
MoMe
347
92
0
01 Feb 2024
Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion
Anke Tang
Li Shen
Yong Luo
Liang Ding
Han Hu
Bo Du
Dacheng Tao
MoMe
371
33
0
11 Dec 2023
Train 'n Trade: Foundations of Parameter Markets
Neural Information Processing Systems (NeurIPS), 2023
Tzu-Heng Huang
Harit Vishwakarma
Frederic Sala
AIFin
225
4
0
07 Dec 2023
Merging by Matching Models in Task Parameter Subspaces
Derek Tam
Mohit Bansal
Colin Raffel
MoMe
366
24
0
07 Dec 2023
Proving Linear Mode Connectivity of Neural Networks via Optimal Transport
International Conference on Artificial Intelligence and Statistics (AISTATS), 2023
Damien Ferbach
Baptiste Goujaud
Gauthier Gidel
Hadrien Hendrikx
MoMe
449
20
0
29 Oct 2023