Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2302.04863
Cited By
Knowledge is a Region in Weight Space for Fine-tuned Language Models
9 February 2023
Almog Gueta
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Knowledge is a Region in Weight Space for Fine-tuned Language Models"
38 / 38 papers shown
Title
TrustLoRA: Low-Rank Adaptation for Failure Detection under Out-of-distribution Data
Fei Zhu
Zhaoxiang Zhang
OODD
UQCV
55
0
0
20 Apr 2025
Charting and Navigating Hugging Face's Model Atlas
Eliahu Horwitz
Nitzan Kurer
Jonathan Kahana
Liel Amar
Yedid Hoshen
31
0
0
13 Mar 2025
Neuroplasticity and Corruption in Model Mechanisms: A Case Study Of Indirect Object Identification
Vishnu Kabir Chhabra
Ding Zhu
Mohammad Mahdi Khalili
37
2
0
27 Feb 2025
Portable Reward Tuning: Towards Reusable Fine-Tuning across Different Pretrained Models
Daiki Chijiwa
Taku Hasegawa
Kyosuke Nishida
Kuniko Saito
Susumu Takeuchi
39
0
0
18 Feb 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
56
17
0
08 Jan 2025
Gradient Localization Improves Lifelong Pretraining of Language Models
Jared Fernandez
Yonatan Bisk
Emma Strubell
KELM
28
1
0
07 Nov 2024
Local Contrastive Editing of Gender Stereotypes
Marlene Lutz
Rochelle Choenni
M. Strohmaier
Anne Lauscher
14
0
0
23 Oct 2024
What Matters for Model Merging at Scale?
Prateek Yadav
Tu Vu
Jonathan Lai
Alexandra Chronopoulou
Manaal Faruqui
Mohit Bansal
Tsendsuren Munkhdalai
MoMe
44
12
0
04 Oct 2024
Realistic Evaluation of Model Merging for Compositional Generalization
Derek Tam
Yash Kant
Brian Lester
Igor Gilitschenski
Colin Raffel
MoMe
16
5
0
26 Sep 2024
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
Yupeng Chen
Senmiao Wang
Zhihang Lin
Zhihang Lin
Yushun Zhang
Tian Ding
Ruoyu Sun
Ruoyu Sun
CLL
72
1
0
30 Jul 2024
WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé
Johan Ferret
Nino Vieillard
Robert Dadashi
Léonard Hussenot
Pierre-Louis Cedoz
Pier Giuseppe Sessa
Sertan Girgin
Arthur Douillard
Olivier Bachem
47
13
0
24 Jun 2024
Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation
Branislav Pecher
Ján Cegin
Róbert Belanec
Jakub Simko
Ivan Srba
M. Bieliková
35
1
0
18 Jun 2024
Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Zhenyi Lu
Chenghao Fan
Wei Wei
Xiaoye Qu
Dangyang Chen
Yu Cheng
MoMe
42
48
0
17 Jun 2024
Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training
Feiteng Fang
Yuelin Bai
Shiwen Ni
Min Yang
Xiaojun Chen
Ruifeng Xu
AAML
RALM
27
29
0
31 May 2024
Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang
Yuyang Zhang
Xiaoguang Li
Wenxuan Shi
Haonan Xu
...
Yasheng Wang
Lifeng Shang
Qun Liu
Yong-jin Liu
Ruiming Tang
KELM
33
4
0
29 May 2024
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction
Tingchen Fu
Deng Cai
Lemao Liu
Shuming Shi
Rui Yan
MoMe
45
13
0
22 May 2024
Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch
Leshem Choshen
Andrew Wood
Ilias Enmouri
Peter Chin
S. Sundararaman
Danny Harnik
36
5
0
05 Apr 2024
Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking
Nikhil Prakash
Tamar Rott Shaham
Tal Haklay
Yonatan Belinkov
David Bau
28
51
0
22 Feb 2024
Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond
Xinyu Wang
Hainiu Xu
Lin Gui
Yulan He
MoMe
AIFin
28
1
0
22 Feb 2024
Backward Lens: Projecting Language Model Gradients into the Vocabulary Space
Shahar Katz
Yonatan Belinkov
Mor Geva
Lior Wolf
44
9
1
20 Feb 2024
Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning
Haeju Lee
Minchan Jeong
SeYoung Yun
Kee-Eung Kim
AAML
VPVLM
53
2
0
13 Feb 2024
WARM: On the Benefits of Weight Averaged Reward Models
Alexandre Ramé
Nino Vieillard
Léonard Hussenot
Robert Dadashi
Geoffrey Cideron
Olivier Bachem
Johan Ferret
95
92
0
22 Jan 2024
A Comprehensive Study of Knowledge Editing for Large Language Models
Ningyu Zhang
Yunzhi Yao
Bo Tian
Peng Wang
Shumin Deng
...
Lei Liang
Zhiqiang Zhang
Xiao-Jun Zhu
Jun Zhou
Huajun Chen
KELM
26
76
0
02 Jan 2024
Merging by Matching Models in Task Parameter Subspaces
Derek Tam
Mohit Bansal
Colin Raffel
MoMe
16
10
0
07 Dec 2023
Can Knowledge Graphs Reduce Hallucinations in LLMs? : A Survey
Garima Agrawal
Tharindu Kumarage
Zeyad Alghami
Huanmin Liu
27
80
0
14 Nov 2023
Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion
Kerem Zaman
Leshem Choshen
Shashank Srivastava
KELM
MoMe
13
10
0
13 Nov 2023
RSVP: Customer Intent Detection via Agent Response Contrastive and Generative Pre-Training
Yu-Chien Tang
Wei-Yao Wang
An-Zi Yen
Wenjie Peng
19
0
0
15 Oct 2023
Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
Pingzhi Li
Zhenyu (Allen) Zhang
Prateek Yadav
Yi-Lin Sung
Yu Cheng
Mohit Bansal
Tianlong Chen
MoMe
15
33
0
02 Oct 2023
Cordyceps@LT-EDI: Patching Language-Specific Homophobia/Transphobia Classifiers with a Multilingual Understanding
Dean Ninalga
17
2
0
24 Sep 2023
Derivative Free Weight-space Ensembling
Dean Ninalga
MoMe
13
0
0
07 Jul 2023
Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
Max Zimmer
Christoph Spiegel
S. Pokutta
MoMe
28
14
0
29 Jun 2023
TIES-Merging: Resolving Interference When Merging Models
Prateek Yadav
Derek Tam
Leshem Choshen
Colin Raffel
Mohit Bansal
MoMe
14
244
0
02 Jun 2023
ZipIt! Merging Models from Different Tasks without Training
George Stoica
Daniel Bolya
J. Bjorner
Pratik Ramesh
Taylor N. Hearn
Judy Hoffman
VLM
MoMe
33
109
0
04 May 2023
ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Shachar Don-Yehiya
Elad Venezian
Colin Raffel
Noam Slonim
Yoav Katz
Leshem Choshen
MoMe
14
52
0
02 Dec 2022
Git Re-Basin: Merging Models modulo Permutation Symmetries
Samuel K. Ainsworth
J. Hayase
S. Srinivasa
MoMe
239
313
0
11 Sep 2022
Linear Connectivity Reveals Generalization Strategies
Jeevesh Juneja
Rachit Bansal
Kyunghyun Cho
João Sedoc
Naomi Saphra
232
45
0
24 May 2022
e-SNLI: Natural Language Inference with Natural Language Explanations
Oana-Maria Camburu
Tim Rocktaschel
Thomas Lukasiewicz
Phil Blunsom
LRM
252
618
0
04 Dec 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
294
6,927
0
20 Apr 2018
1