Knowledge is a Region in Weight Space for Fine-tuned Language Models

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
9 February 2023
Almog Gueta, Elad Venezian, Colin Raffel, Noam Slonim, Yoav Katz, Leshem Choshen
arXiv:2302.04863 (abs · PDF · HTML · GitHub)

Papers citing "Knowledge is a Region in Weight Space for Fine-tuned Language Models"

38 papers shown

Learning to Interpret Weight Differences in Language Models
Avichal Goel, Yoon Kim, Nir Shavit, T. T. Wang
06 Oct 2025

SafeConstellations: Mitigating Over-Refusals in LLMs Through Task-Aware Representation Steering
Utsav Maskey, Sumit Yadav, Mark Dras, Usman Naseem
15 Aug 2025

Dynamic Weight Grafting: Localizing Finetuned Factual Knowledge in Transformers
Todd Nief, David Reber, Sean Richardson, Ari Holtzman
25 Jun 2025

TrustLoRA: Low-Rank Adaptation for Failure Detection under Out-of-distribution Data
Fei Zhu, Zhaoxiang Zhang
20 Apr 2025

Neuroplasticity and Corruption in Model Mechanisms: A Case Study Of Indirect Object Identification
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Vishnu Kabir Chhabra, Ding Zhu, Mohammad Mahdi Khalili
27 Feb 2025

Portable Reward Tuning: Towards Reusable Fine-Tuning across Different Pretrained Models
Daiki Chijiwa, Taku Hasegawa, Kyosuke Nishida, Kuniko Saito, Susumu Takeuchi
18 Feb 2025

Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He, Yuzheng Hu, Yong Lin, Tong Zhang, Han Zhao
08 Jan 2025

Reversed Attention: On The Gradient Descent Of Attention Layers In GPT
Shahar Katz, Lior Wolf
22 Dec 2024

Gradient Localization Improves Lifelong Pretraining of Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jared Fernandez, Yonatan Bisk, Emma Strubell
07 Nov 2024

Local Contrastive Editing of Gender Stereotypes
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Marlene Lutz, Rochelle Choenni, M. Strohmaier, Anne Lauscher
23 Oct 2024

Deep Model Merging: The Sister of Neural Network Interpretability -- A Survey
A. Khan, Todd Nief, Nathaniel Hudson, Mansi Sakarvadia, Daniel Grzenda, Aswathy Ajith, Jordan Pettyjohn, Kyle Chard, Ian Foster
16 Oct 2024

What Matters for Model Merging at Scale?
Prateek Yadav, Tu Vu, Jonathan Lai, Alexandra Chronopoulou, Manaal Faruqui, Joey Tianyi Zhou, Tsendsuren Munkhdalai
04 Oct 2024

Realistic Evaluation of Model Merging for Compositional Generalization
Derek Tam, Yash Kant, Brian Lester, Igor Gilitschenski, Colin Raffel
26 Sep 2024

MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
Yupeng Chen, Senmiao Wang, Yushun Zhang, Zhihang Lin, Haozhe Zhang, Tian Ding, Ruoyu Sun
30 Jul 2024

WARP: On the Benefits of Weight Averaged Rewarded Policies
Alexandre Ramé, Johan Ferret, Nino Vieillard, Robert Dadashi, Léonard Hussenot, Pierre-Louis Cedoz, Pier Giuseppe Sessa, Sertan Girgin, Arthur Douillard, Olivier Bachem
24 Jun 2024

Fighting Randomness with Randomness: Mitigating Optimisation Instability of Fine-Tuning using Delayed Ensemble and Noisy Interpolation
Branislav Pecher, Ján Cegin, Róbert Belanec, Jakub Simko, Ivan Srba, Maria Bielikova
18 Jun 2024

Twin-Merging: Dynamic Integration of Modular Expertise in Model Merging
Zhenyi Lu, Chenghao Fan, Wei Wei, Xiaoye Qu, Dangyang Chen, Yu Cheng
17 Jun 2024

Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training
Feiteng Fang, Yuelin Bai, Shiwen Ni, Min Yang, Xiaojun Chen, Ruifeng Xu
31 May 2024

Evaluating the External and Parametric Knowledge Fusion of Large Language Models
Hao Zhang, Yuyang Zhang, Xiaoguang Li, Wenxuan Shi, Haonan Xu, ..., Yasheng Wang, Lifeng Shang, Qun Liu, Yong Liu, Ruiming Tang
29 May 2024

Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction
Tingchen Fu, Deng Cai, Lemao Liu, Shuming Shi, Rui Yan
22 May 2024

Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch, Leshem Choshen, Andrew Wood, Ilias Enmouri, Peter Chin, S. Sundararaman, Danny Harnik
05 Apr 2024

Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking
Nikhil Prakash, Tamar Rott Shaham, Tal Haklay, Yonatan Belinkov, David Bau
22 Feb 2024

Towards Unified Task Embeddings Across Multiple Models: Bridging the Gap for Prompt-Based Large Language Models and Beyond
Xinyu Wang, Hainiu Xu, Lin Gui, Yulan He
22 Feb 2024

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space
Shahar Katz, Yonatan Belinkov, Mor Geva, Lior Wolf
20 Feb 2024

Bayesian Multi-Task Transfer Learning for Soft Prompt Tuning
Haeju Lee, Minchan Jeong, SeYoung Yun, Kee-Eung Kim
13 Feb 2024

WARM: On the Benefits of Weight Averaged Reward Models
International Conference on Machine Learning (ICML), 2024
Alexandre Ramé, Nino Vieillard, Léonard Hussenot, Robert Dadashi, Geoffrey Cideron, Olivier Bachem, Johan Ferret
22 Jan 2024

A Comprehensive Study of Knowledge Editing for Large Language Models
Ningyu Zhang, Yunzhi Yao, Bo Tian, Peng Wang, Shumin Deng, ..., Lei Liang, Qing Cui, Xiao-Jun Zhu, Jun Zhou, Huajun Chen
02 Jan 2024

Merging by Matching Models in Task Parameter Subspaces
Derek Tam, Mohit Bansal, Colin Raffel
07 Dec 2023

Can Knowledge Graphs Reduce Hallucinations in LLMs?: A Survey
North American Chapter of the Association for Computational Linguistics (NAACL), 2023
Garima Agrawal, Tharindu Kumarage, Zeyad Alghami, Huanmin Liu
14 Nov 2023

Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023
Kerem Zaman, Leshem Choshen, Shashank Srivastava
13 Nov 2023

RSVP: Customer Intent Detection via Agent Response Contrastive and Generative Pre-Training
Yu-Chien Tang, Wei-Yao Wang, An-Zi Yen, Wenjie Peng
15 Oct 2023

Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
International Conference on Learning Representations (ICLR), 2023
Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen
02 Oct 2023

Cordyceps@LT-EDI: Patching Language-Specific Homophobia/Transphobia Classifiers with a Multilingual Understanding
Dean Ninalga
24 Sep 2023

Derivative Free Weight-space Ensembling
Dean Ninalga
07 Jul 2023

Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
International Conference on Learning Representations (ICLR), 2023
Max Zimmer, Christoph Spiegel, Sebastian Pokutta
29 Jun 2023

TIES-Merging: Resolving Interference When Merging Models
Neural Information Processing Systems (NeurIPS), 2023
Prateek Yadav, Derek Tam, Leshem Choshen, Colin Raffel, Joey Tianyi Zhou
02 Jun 2023

ZipIt! Merging Models from Different Tasks without Training
International Conference on Learning Representations (ICLR), 2023
George Stoica, Daniel Bolya, J. Bjorner, Pratik Ramesh, Taylor N. Hearn, Judy Hoffman
04 May 2023

ColD Fusion: Collaborative Descent for Distributed Multitask Finetuning
Annual Meeting of the Association for Computational Linguistics (ACL), 2022
Shachar Don-Yehiya, Elad Venezian, Colin Raffel, Noam Slonim, Yoav Katz, Leshem Choshen
02 Dec 2022