Exploring Mode Connectivity for Pre-trained Language Models
arXiv 2210.14102 · 25 October 2022
Yujia Qin, Cheng Qian, Jing Yi, Weize Chen, Yankai Lin, Xu Han, Zhiyuan Liu, Maosong Sun, Jie Zhou

Papers citing "Exploring Mode Connectivity for Pre-trained Language Models" (20 of 20 papers shown)

Understanding Machine Unlearning Through the Lens of Mode Connectivity
Jiali Cheng, Hadi Amiri · MU · 08 Apr 2025

Training-Free Model Merging for Multi-target Domain Adaptation
Wenyi Li, Huan-ang Gao, Mingju Gao, Beiwen Tian, Rong Zhi, Hao Zhao · MoMe · 18 Jul 2024

DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen, Linhao Zhang, Junyuan Shang, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun · 03 Jun 2024

Predicting the Impact of Model Expansion through the Minima Manifold: A Loss Landscape Perspective
Pranshu Malviya, Jerry Huang, Quentin Fournier, Sarath Chandar · 24 May 2024

On the Emergence of Cross-Task Linearity in the Pretraining-Finetuning Paradigm
Zhanpeng Zhou, Zijun Chen, Yilan Chen, Bo-Wen Zhang, Junchi Yan · MoMe · 06 Feb 2024

Merging by Matching Models in Task Parameter Subspaces
Derek Tam, Mohit Bansal, Colin Raffel · MoMe · 07 Dec 2023

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu, Yu Bowen, Haiyang Yu, Fei Huang, Yongbin Li · MoMe · 06 Nov 2023

Proving Linear Mode Connectivity of Neural Networks via Optimal Transport
Damien Ferbach, Baptiste Goujaud, Gauthier Gidel, Aymeric Dieuleveut · MoMe · 29 Oct 2023

Deep Model Fusion: A Survey
Weishi Li, Yong Peng, Miao Zhang, Liang Ding, Han Hu, Li Shen · FedML, MoMe · 27 Sep 2023

Composing Parameter-Efficient Modules with Arithmetic Operations
Jinghan Zhang, Shiqi Chen, Junteng Liu, Junxian He · KELM, MoMe · 26 Jun 2023

Lookaround Optimizer: k steps around, 1 step average
Jiangtao Zhang, Shunyu Liu, Jie Song, Tongtian Zhu, Zhenxing Xu, Mingli Song · MoMe · 13 Jun 2023

Soft Merging of Experts with Adaptive Routing
Mohammed Muqeeth, Haokun Liu, Colin Raffel · MoMe, MoE · 06 Jun 2023

Recyclable Tuning for Continual Pre-training
Yujia Qin, Cheng Qian, Xu Han, Yankai Lin, Huadong Wang, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou · CLL · 15 May 2023

Knowledge is a Region in Weight Space for Fine-tuned Language Models
Almog Gueta, Elad Venezian, Colin Raffel, Noam Slonim, Yoav Katz, Leshem Choshen · 09 Feb 2023

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization
Alexandre Ramé, Kartik Ahuja, Jianyu Zhang, Matthieu Cord, Léon Bottou, David Lopez-Paz · MoMe, OODD · 20 Dec 2022

ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya, Elad Venezian, Colin Raffel, Noam Slonim, Yoav Katz, Leshem Choshen · MoMe · 02 Dec 2022

The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester, Rami Al-Rfou, Noah Constant · VPVLM · 18 Apr 2021

Extracting Training Data from Large Language Models
Nicholas Carlini, Florian Tramèr, Eric Wallace, Matthew Jagielski, Ariel Herbert-Voss, ..., Tom B. Brown, D. Song, Ulfar Erlingsson, Alina Oprea, Colin Raffel · MLAU, SILM · 14 Dec 2020

Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick, Hinrich Schütze · 21 Jan 2020

Language Models as Knowledge Bases?
Fabio Petroni, Tim Rocktaschel, Patrick Lewis, A. Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel · KELM, AI4MH · 03 Sep 2019