An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning


17 August 2023
Yun Luo
Zhen Yang
Fandong Meng
Yafu Li
Jie Zhou
Yue Zhang
    CLL
    KELM

Papers citing "An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning"

50 / 61 papers shown
Title
PosterO: Structuring Layout Trees to Enable Language Models in Generalized Content-Aware Layout Generation
HsiaoYuan Hsu
Yuxin Peng
21
0
0
06 May 2025
Knowledge Graphs for Enhancing Large Language Models in Entity Disambiguation
Gerard Pons
Besim Bilalli
Anna Queralt
38
1
0
05 May 2025
ConSens: Assessing context grounding in open-book question answering
Ivan Vankov
Matyo Ivanov
Adriana Correia
Victor Botev
ELM
63
0
0
30 Apr 2025
Memorization and Knowledge Injection in Gated LLMs
Xu Pan
Ely Hahami
Zechen Zhang
H. Sompolinsky
KELM
CLL
RALM
104
0
0
30 Apr 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
59
0
0
02 Apr 2025
MAP: Multi-user Personalization with Collaborative LLM-powered Agents
Christine P. Lee
Jihye Choi
Bilge Mutlu
LLMAG
70
0
1
17 Mar 2025
Att-Adapter: A Robust and Precise Domain-Specific Multi-Attributes T2I Diffusion Adapter via Conditional Variational Autoencoder
Wonwoong Cho
Yan-Ying Chen
M. Klenk
David I. Inouye
Yanxia Zhang
DiffM
159
0
0
15 Mar 2025
MoFE: Mixture of Frozen Experts Architecture
Jean Seo
Jaeyoon Kim
Hyopil Shin
MoE
158
0
0
09 Mar 2025
Exploiting Edited Large Language Models as General Scientific Optimizers
Qitan Lv
T. Liu
H. Wang
38
0
0
08 Mar 2025
A General Framework to Enhance Fine-tuning-based LLM Unlearning
J. Ren
Zhenwei Dai
X. Tang
Hui Liu
Jingying Zeng
...
R. Goutam
Suhang Wang
Yue Xing
Qi He
Hui Liu
MU
163
1
0
25 Feb 2025
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes
Bryan R Christ
Zack Gottesman
Jonathan Kropko
Thomas Hartvigsen
LRM
51
2
0
20 Feb 2025
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Gangwei Jiang
Caigao Jiang
Zhaoyi Li
Siqiao Xue
Jun-ping Zhou
Linqi Song
Defu Lian
Yin Wei
CLL
MU
58
0
0
16 Feb 2025
Typhoon T1: An Open Thai Reasoning Model
Pittawat Taveekitworachai
Potsawee Manakul
Kasima Tharnpipitchai
Kunat Pipatanakul
OffRL
LRM
102
0
0
13 Feb 2025
Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks
Jing Yang
Max Glockner
Anderson de Rezende Rocha
Iryna Gurevych
LRM
67
1
0
07 Feb 2025
MergeME: Model Merging Techniques for Homogeneous and Heterogeneous MoEs
Yuhang Zhou
Giannis Karamanolakis
Victor Soto
Anna Rumshisky
Mayank Kulkarni
Furong Huang
Wei Ai
Jianhua Lu
MoMe
104
0
0
03 Feb 2025
Understanding and Mitigating Gender Bias in LLMs via Interpretable Neuron Editing
Zeping Yu
Sophia Ananiadou
KELM
43
1
0
24 Jan 2025
Refining Input Guardrails: Enhancing LLM-as-a-Judge Efficiency Through Chain-of-Thought Fine-Tuning and Alignment
Melissa Kazemi Rad
Huy Nghiem
Andy Luo
Sahil Wadhwa
Mohammad Sorower
Stephen Rawls
AAML
93
2
0
22 Jan 2025
Keeping LLMs Aligned After Fine-tuning: The Crucial Role of Prompt Templates
Kaifeng Lyu
Haoyu Zhao
Xinran Gu
Dingli Yu
Anirudh Goyal
Sanjeev Arora
ALM
79
44
0
20 Jan 2025
CrisisSense-LLM: Instruction Fine-Tuned Large Language Model for Multi-label Social Media Text Classification in Disaster Informatics
Kai Yin
Chengkai Liu
Ali Mostafavi
Xia Hu
53
8
0
17 Jan 2025
Soup to go: mitigating forgetting during continual learning with model averaging
Anat Kleiman
Gintare Karolina Dziugaite
Jonathan Frankle
Sham Kakade
Mansheej Paul
MoMe
CLL
KELM
51
0
0
09 Jan 2025
ConTrans: Weak-to-Strong Alignment Engineering via Concept Transplantation
Weilong Dong
Xinwei Wu
Renren Jin
Shaoyang Xu
Deyi Xiong
61
7
0
31 Dec 2024
On Domain-Specific Post-Training for Multimodal Large Language Models
Daixuan Cheng
Shaohan Huang
Ziyu Zhu
Xintong Zhang
Wayne Xin Zhao
Zhongzhi Luan
Bo Dai
Zhenliang Zhang
VLM
92
2
0
29 Nov 2024
Are Large Language Models Memorizing Bug Benchmarks?
Daniel Ramos
Claudia Mamede
Kush Jain
Paulo Canelas
Catarina Gamboa
Claire Le Goues
PILM
ELM
94
6
0
20 Nov 2024
RAG-DDR: Optimizing Retrieval-Augmented Generation Using Differentiable Data Rewards
Xinze Li
Sen Mei
Zhenghao Liu
Yukun Yan
Shuo Wang
...
H. Chen
Ge Yu
Zhiyuan Liu
Maosong Sun
Chenyan Xiong
44
6
0
17 Oct 2024
Functional-level Uncertainty Quantification for Calibrated Fine-tuning on LLMs
Ruijia Niu
D. Wu
Rose Yu
Yi-An Ma
30
1
0
09 Oct 2024
How Much Can We Forget about Data Contamination?
Sebastian Bordt
Suraj Srinivas
Valentyn Boreiko
U. V. Luxburg
45
1
0
04 Oct 2024
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
Shaoxiong Ji
Zihao Li
Indraneil Paul
Jaakko Paavola
Peiqin Lin
...
Dayyán O'Brien
Hengyu Luo
Hinrich Schütze
Jörg Tiedemann
Barry Haddow
CLL
35
3
0
26 Sep 2024
CraftRTL: High-quality Synthetic Data Generation for Verilog Code Models with Correct-by-Construction Non-Textual Representations and Targeted Code Repair
Mingjie Liu
Yun-Da Tsai
Wenfei Zhou
Haoxing Ren
SyDa
3DV
45
6
0
19 Sep 2024
Recent Advances in Attack and Defense Approaches of Large Language Models
Jing Cui
Yishi Xu
Zhewei Huang
Shuchang Zhou
Jianbin Jiao
Junge Zhang
PILM
AAML
54
1
0
05 Sep 2024
Prompt Baking
Aman Bhargava
Cameron Witkowski
Alexander Detkov
Matt W. Thomson
AI4CE
35
0
0
04 Sep 2024
DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model
Mona Sheikh Zeinoddin
Chiara Lena
Jiongqi Qu
Luca Carlini
Mattia Magro
...
E. Mazomenos
Daniel C. Alexander
Danail Stoyanov
Matthew J. Clarkson
Mobarakol Islam
31
1
0
30 Aug 2024
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
Yupeng Chen
Senmiao Wang
Zhihang Lin
Yushun Zhang
Tian Ding
Ruoyu Sun
CLL
72
1
0
30 Jul 2024
Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing
Huanqian Wang
Yang Yue
Rui Lu
Jingxin Shi
Andrew Zhao
Shenzhi Wang
Shiji Song
Gao Huang
LM&Ro
KELM
49
6
0
11 Jul 2024
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Zilong Wang
Zifeng Wang
Long Le
Huaixiu Steven Zheng
Swaroop Mishra
...
Anush Mattapalli
Ankur Taly
Jingbo Shang
Chen-Yu Lee
Tomas Pfister
RALM
80
31
0
11 Jul 2024
The impact of model size on catastrophic forgetting in Online Continual Learning
Eunhae Lee
CLL
26
0
0
28 Jun 2024
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
Hoyeon Chang
Jinho Park
Seonghyeon Ye
Sohee Yang
Youngkyung Seo
Du-Seong Chang
Minjoon Seo
KELM
37
30
0
17 Jun 2024
HORAE: A Domain-Agnostic Language for Automated Service Regulation
Yutao Sun
Mingshuai Chen
Tiancheng Zhao
Kangjia Zhao
He Li
...
Zhongyi Wang
Liqiang Lu
Shuiguang Deng
Jianwei Yin
134
0
0
06 Jun 2024
Disentangling and Mitigating the Impact of Task Similarity for Continual Learning
Naoki Hiratani
CLL
35
2
0
30 May 2024
KG-RAG: Bridging the Gap Between Knowledge and Creativity
Diego Sanmartin
RALM
44
36
0
20 May 2024
MBIAS: Mitigating Bias in Large Language Models While Retaining Context
Shaina Raza
Ananya Raval
Veronica Chatrath
44
6
0
18 May 2024
HFT: Half Fine-Tuning for Large Language Models
Tingfeng Hui
Zhenyu Zhang
Shuohuan Wang
Weiran Xu
Yu Sun
Hua-Hong Wu
CLL
37
4
0
29 Apr 2024
CrossIn: An Efficient Instruction Tuning Approach for Cross-Lingual Knowledge Alignment
Geyu Lin
Bin Wang
Zhengyuan Liu
Nancy F. Chen
32
7
0
18 Apr 2024
AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees
William Fleshman
Aleem Khan
Marc Marone
Benjamin Van Durme
CLL
KELM
55
3
0
12 Apr 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye
Peiju Liu
Tianxiang Sun
Yunhua Zhou
Jun Zhan
Xipeng Qiu
37
62
0
25 Mar 2024
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error
Boshi Wang
Hao Fang
Jason Eisner
Benjamin Van Durme
Yu-Chuan Su
CLL
29
7
0
07 Mar 2024
JMLR: Joint Medical LLM and Retrieval Training for Enhancing Reasoning and Professional Question Answering Capability
Junda Wang
Zhichao Yang
Zonghai Yao
Hong-ye Yu
BDL
AI4MH
LRM
40
30
0
27 Feb 2024
Investigating Continual Pretraining in Large Language Models: Insights and Implications
Çağatay Yildiz
Nishaanth Kanna Ravichandran
Prishruit Punia
Matthias Bethge
B. Ermiş
CLL
KELM
LRM
48
25
0
27 Feb 2024
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
Zhaorui Yang
Tianyu Pang
H. Feng
Han Wang
Wei Chen
Minfeng Zhu
Qian Liu
ALM
32
34
0
21 Feb 2024
RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models
Jianhao Yan
Yun Luo
Yue Zhang
ALM
LRM
38
6
0
21 Feb 2024
CoLLaVO: Crayon Large Language and Vision mOdel
Byung-Kwan Lee
Beomchan Park
Chae Won Kim
Yonghyun Ro
VLM
MLLM
29
16
0
17 Feb 2024