Editing Models with Task Arithmetic

International Conference on Learning Representations (ICLR), 2022

8 December 2022

Papers citing "Editing Models with Task Arithmetic"

50 / 525 papers shown

LLM Augmented LLMs: Expanding Capabilities through Composition

Sriram Ganapathy

245

04 Jan 2024

PILoRA: Prototype Guided Incremental LoRA for Federated Class-Incremental Learning

European Conference on Computer Vision (ECCV), 2024

284

04 Jan 2024

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity

International Conference on Machine Learning (ICML), 2024

Jonathan K. Kummerfeld

Amélie Reymond

324

158

03 Jan 2024

A Comprehensive Study of Knowledge Editing for Large Language Models

Ningyu Zhang

Yunzhi Yao

Bo Tian

Peng Wang

Shumin Deng

...

Lei Liang

Huajun Chen

493

126

02 Jan 2024

Partial Fine-Tuning: A Successor to Full Fine-Tuning for Vision Transformers

Peng Ye

Tao Chen

Wanli Ouyang

183

25 Dec 2023

Merging Vision Transformers from Different Tasks and Domains

Peng Ye

Tao Chen

Wanli Ouyang

221

25 Dec 2023

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

Anirudh S. Sundar

Chao-Han Huck Yang

David M. Chan

Shalini Ghosh

Venkatesh Ravichandran

P. S. Nidadavolu

MoMe

294

22 Dec 2023

Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment

Haoran Xie

300

266

19 Dec 2023

Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks

Mohammad-Javad Davari

Eugene Belilovsky

MoMe

262

11 Dec 2023

Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion

Li Shen

Liang Ding

Bo Du

289

11 Dec 2023

Merging by Matching Models in Task Parameter Subspaces

Derek Tam

Mohit Bansal

Colin Raffel

MoMe

335

07 Dec 2023

Knowledge Unlearning for LLMs: Tasks, Methods, and Challenges

395

27 Nov 2023

ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization

240

22 Nov 2023

In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering

International Conference on Machine Learning (ICML), 2023

Sheng Liu

Haotian Ye

Lei Xing

James Y. Zou

250

210

11 Nov 2023

LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

Hang Zhao

607

207

09 Nov 2023

Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

International Conference on Machine Learning (ICML), 2023

557

492

06 Nov 2023

A Survey on Knowledge Editing of Neural Networks

IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023

411

30 Oct 2023

SoK: Memorization in General-Purpose Large Language Models

Valentin Hartmann

Anshuman Suri

Vincent Bindschaedler

327

24 Oct 2023

SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

Haoxiang Wang

Pavan Kumar Anasosalu Vasu

550

127

23 Oct 2023

Function Vectors in Large Language Models

International Conference on Learning Representations (ICLR), 2023

324

183

23 Oct 2023

Equivariant Deep Weight Space Alignment

379

20 Oct 2023

Model Merging by Uncertainty-Based Gradient Matching

Mohammad Emtiyaz Khan

MoMe FedML

307

19 Oct 2023

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

Luke Zettlemoyer

Yejin Choi

Prithviraj Ammanabrolu

MoMe

321

213

17 Oct 2023

Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective

International Conference on Learning Representations (ICLR), 2023

360

17 Oct 2023

Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting

International Conference on Learning Representations (ICLR), 2023

Melanie Sclar

Yejin Choi

Yulia Tsvetkov

Alane Suhr

318

549

17 Oct 2023

Can We Edit Multimodal Large Language Models?

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

Huajun Chen

Ningyu Zhang

MLLM

597

12 Oct 2023

Measuring Feature Sparsity in Language Models

Mingyang Deng

Lucas Tao

Joe Benton

234

11 Oct 2023

A Meta-Learning Perspective on Transformers for Causal Language Modeling

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

Xinbo Wu

Lav Varshney

304

09 Oct 2023

Establishing Trustworthiness: Rethinking Tasks and Model Evaluation

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023

188

09 Oct 2023

Uncovering hidden geometry in Transformers via disentangling position and context

Jiajun Song

Yiqiao Zhong

247

07 Oct 2023

Parameter Efficient Multi-task Model Fusion with Partial Linearization

International Conference on Learning Representations (ICLR), 2023

Li Shen

Bo Du

358

07 Oct 2023

AdaMerging: Adaptive Model Merging for Multi-Task Learning

International Conference on Learning Representations (ICLR), 2023

Li Shen

325

181

04 Oct 2023

BYOM: Building Your Own Multi-Task Model For Free

295

03 Oct 2023

Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy

International Conference on Learning Representations (ICLR), 2023

Mohit Bansal

274

02 Oct 2023

ScaLearn: Simple and Highly Parameter-Efficient Task Transfer by Learning to Scale

Annual Meeting of the Association for Computational Linguistics (ACL), 2023

342

02 Oct 2023

Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks

International Conference on Learning Representations (ICLR), 2023

302

147

29 Sep 2023

Deep Model Fusion: A Survey

Liang Ding

Li Shen

299

27 Sep 2023

Knowledge Sanitization of Large Language Models

Yoichi Ishibashi

Hidetoshi Shimodaira

KELM

254

21 Sep 2023

Cognitive Mirage: A Review of Hallucinations in Large Language Models

376

112

13 Sep 2023

Circuit Breaking: Removing Model Behaviors with Targeted Ablation

299

12 Sep 2023

Emergent Linear Representations in World Models of Self-Supervised Sequence Models

BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (BlackboxNLP), 2023

311

247

02 Sep 2023

Fine-tuning can cripple your foundation model; preserving features may be the solution

385

25 Aug 2023

Overcoming Generic Knowledge Loss with Selective Parameter Update

Computer Vision and Pattern Recognition (CVPR), 2023

377

23 Aug 2023

UnIVAL: Unified Model for Image, Video, Audio and Language Tasks

308

30 Jul 2023

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

467

291

25 Jul 2023

Layer-wise Linear Mode Connectivity

International Conference on Learning Representations (ICLR), 2023

Linara Adilova

Maksym Andriushchenko

519

13 Jul 2023

STG-MTL: Scalable Task Grouping for Multi-Task Learning Using Data Map

398

07 Jul 2023

ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models

470

01 Jul 2023

Composing Parameter-Efficient Modules with Arithmetic Operations

Neural Information Processing Systems (NeurIPS), 2023

339

152

26 Jun 2023

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

Neural Information Processing Systems (NeurIPS), 2023

360

202

07 Jun 2023