Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2404.16789
Cited By
Continual Learning of Large Language Models: A Comprehensive Survey
25 April 2024
Haizhou Shi
Zihao Xu
Hengyi Wang
Weiyi Qin
Wenyuan Wang
Yibin Wang
Zifeng Wang
Sayna Ebrahimi
Hao Wang
CLL
KELM
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Continual Learning of Large Language Models: A Comprehensive Survey"
50 / 52 papers shown
Title
Elastic Weight Consolidation for Full-Parameter Continual Pre-Training of Gemma2
Vytenis Šliogeris
Povilas Daniušis
Arturas Nakvosas
CLL
26
0
0
09 May 2025
SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning
Jinpeng Chen
Runmin Cong
Yuzhi Zhao
Hongzheng Yang
Guangneng Hu
H. Ip
Sam Kwong
CLL
KELM
59
0
0
05 May 2025
Memorization and Knowledge Injection in Gated LLMs
Xu Pan
Ely Hahami
Zechen Zhang
H. Sompolinsky
KELM
CLL
RALM
101
0
0
30 Apr 2025
Enhanced Continual Learning of Vision-Language Models with Model Fusion
Haoyuan Gao
Zicong Zhang
Yuqi Wei
Linglan Zhao
Guilin Li
Y. Li
Linghe Kong
Weiran Huang
CLL
VLM
65
0
0
12 Mar 2025
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
Yibin Wang
H. Shi
Ligong Han
Dimitris N. Metaxas
Hao Wang
BDL
UQLM
93
6
0
28 Jan 2025
Boosting LLM Translation Skills without General Ability Loss via Rationale Distillation
Junhong Wu
Yang Zhao
Yangyifan Xu
Bing Liu
Chengqing Zong
CLL
22
1
0
17 Oct 2024
Scalable Multi-Domain Adaptation of Language Models using Modular Experts
Peter Schafhalter
Shun Liao
Yanqi Zhou
Chih-Kuan Yeh
Arun Kandoor
James Laudon
MoE
14
1
0
14 Oct 2024
MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
Yupeng Chen
Senmiao Wang
Zhihang Lin
Zhihang Lin
Yushun Zhang
Tian Ding
Ruoyu Sun
Ruoyu Sun
CLL
60
1
0
30 Jul 2024
Leveraging Large Language Models for Integrated Satellite-Aerial-Terrestrial Networks: Recent Advances and Future Directions
Shumaila Javaid
R. A. Khalil
Nasir Saeed
Bin He
Mohamed-Slim Alouini
27
8
0
05 Jul 2024
Pretraining and Updating Language- and Domain-specific Large Language Model: A Case Study in Japanese Business Domain
Kosuke Takahashi
Takahiro Omi
Kosuke Arima
Tatsuya Ishigaki
23
0
0
12 Apr 2024
Foundation Model for Advancing Healthcare: Challenges, Opportunities, and Future Directions
Yuting He
Fuxiang Huang
Xinrui Jiang
Yuxiang Nie
Minghao Wang
Jiguang Wang
Hao Chen
LM&MA
AI4CE
71
26
0
04 Apr 2024
BLADE: Enhancing Black-box Large Language Models with Small Domain-Specific Models
Haitao Li
Qingyao Ai
Jia Chen
Qian Dong
Zhijing Wu
Yiqun Liu
Chong Chen
Qi Tian
AILaw
26
13
0
27 Mar 2024
Larimar: Large Language Models with Episodic Memory Control
Payel Das
Subhajit Chaudhury
Elliot Nelson
Igor Melnyk
Sarath Swaminathan
...
Vijil Chenthamarakshan
Jiří
Jirí Navrátil
Soham Dan
Pin-Yu Chen
CLL
KELM
29
5
0
18 Mar 2024
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters
Jiazuo Yu
Yunzhi Zhuge
Lu Zhang
Ping Hu
Dong Wang
Huchuan Lu
You He
VLM
KELM
CLL
OODD
100
67
0
18 Mar 2024
CoLeCLIP: Open-Domain Continual Learning via Joint Task Prompt and Vocabulary Learning
Yukun Li
Guansong Pang
Wei Suo
Chenchen Jing
Yuling Xi
Lingqiao Liu
Hao Chen
Guoqiang Liang
Peng Wang
CLL
VLM
34
8
0
15 Mar 2024
Select and Distill: Selective Dual-Teacher Knowledge Transfer for Continual Learning on Vision-Language Models
Yu-Chu Yu
Chi-Pin Huang
Jr-Jen Chen
Kai-Po Chang
Yung-Hsuan Lai
Fu-En Yang
Yu-Chiang Frank Wang
CLL
VLM
20
7
0
14 Mar 2024
Take the Bull by the Horns: Hard Sample-Reweighted Continual Training Improves LLM Generalization
Xuxi Chen
Zhendong Wang
Daouda Sow
Junjie Yang
Tianlong Chen
Yingbin Liang
Mingyuan Zhou
Zhangyang Wang
25
5
0
22 Feb 2024
Me LLaMA: Foundation Large Language Models for Medical Applications
Qianqian Xie
Qingyu Chen
Aokun Chen
C.A.I. Peng
Yan Hu
...
Huan He
Lucila Ohno-Machido
Yonghui Wu
Hua Xu
Jiang Bian
LM&MA
AI4MH
62
3
0
20 Feb 2024
Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains
Junhong Shen
Neil Tenenholtz
James Hall
David Alvarez-Melis
Nicolò Fusi
40
20
0
06 Feb 2024
Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models
Weijiao Zhang
Jindong Han
Zhao Xu
Hang Ni
Hao Liu
Hui Xiong
Hui Xiong
AI4CE
77
14
0
30 Jan 2024
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI Xiao Bi
:
Xiao Bi
Deli Chen
Guanting Chen
...
Yao Zhao
Shangyan Zhou
Shunfeng Zhou
Qihao Zhu
Yuheng Zou
LRM
ALM
131
298
0
05 Jan 2024
Continual Learning with Low Rank Adaptation
Martin Wistuba
Prabhu Teja Sivaprasad
Lukas Balles
Giovanni Zappella
CLL
18
10
0
29 Nov 2023
Efficient Continual Pre-training for Building Domain Specific Large Language Models
Yong Xie
Karan Aggarwal
Aitzaz Ahmad
CLL
29
21
0
14 Nov 2023
Orthogonal Subspace Learning for Language Model Continual Learning
Xiao Wang
Tianze Chen
Qiming Ge
Han Xia
Rong Bao
Rui Zheng
Qi Zhang
Tao Gui
Xuanjing Huang
CLL
110
85
0
22 Oct 2023
IBCL: Zero-shot Model Generation for Task Trade-offs in Continual Learning
Pengyuan Lu
Michele Caprio
Eric Eaton
Insup Lee
VLM
47
3
0
04 Oct 2023
Investigating the Catastrophic Forgetting in Multimodal Large Language Models
Yuexiang Zhai
Shengbang Tong
Xiao Li
Mu Cai
Qing Qu
Yong Jae Lee
Y. Ma
VLM
MLLM
CLL
66
75
0
19 Sep 2023
CodeGen2: Lessons for Training LLMs on Programming and Natural Languages
Erik Nijkamp
A. Ghobadzadeh
Caiming Xiong
Silvio Savarese
Yingbo Zhou
130
163
0
03 May 2023
PMC-LLaMA: Towards Building Open-source Language Models for Medicine
Chaoyi Wu
Weixiong Lin
Xiaoman Zhang
Ya-Qin Zhang
Yanfeng Wang
Weidi Xie
LM&MA
AI4MH
78
74
0
27 Apr 2023
ChatDoctor: A Medical Chat Model Fine-Tuned on a Large Language Model Meta-AI (LLaMA) Using Medical Domain Knowledge
Yunxiang Li
Zihan Li
Kai Zhang
Ruilong Dan
Steven Jiang
You Zhang
LM&MA
AI4MH
114
366
0
24 Mar 2023
Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models
Zangwei Zheng
Mingyu Ma
Kai Wang
Ziheng Qin
Xiangyu Yue
Yang You
CLL
VLM
84
29
0
12 Mar 2023
Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study
Mingxu Tao
Yansong Feng
Dongyan Zhao
CLL
KELM
14
10
0
02 Mar 2023
Error Sensitivity Modulation based Experience Replay: Mitigating Abrupt Representation Drift in Continual Learning
F. Sarfraz
Elahe Arani
Bahram Zonooz
KELM
CLL
36
25
0
14 Feb 2023
BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
Junnan Li
Dongxu Li
Silvio Savarese
Steven C. H. Hoi
VLM
MLLM
244
4,186
0
30 Jan 2023
Continual Training of Language Models for Few-Shot Learning
Zixuan Ke
Haowei Lin
Yijia Shao
Hu Xu
Lei Shu
Bin Liu
KELM
BDL
CLL
83
33
0
11 Oct 2022
Calibrating Factual Knowledge in Pretrained Language Models
Qingxiu Dong
Damai Dai
Yifan Song
Jingjing Xu
Zhifang Sui
Lei Li
KELM
213
81
0
07 Oct 2022
CLIP model is an Efficient Continual Learner
Vishal G. Thengane
Salman Khan
Munawar Hayat
F. Khan
BDL
VLM
CLL
88
43
0
06 Oct 2022
Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering
Pan Lu
Swaroop Mishra
Tony Xia
Liang Qiu
Kai-Wei Chang
Song-Chun Zhu
Oyvind Tafjord
Peter Clark
A. Kalyan
ELM
ReLM
LRM
198
1,089
0
20 Sep 2022
SparCL: Sparse Continual Learning on the Edge
Zifeng Wang
Zheng Zhan
Yifan Gong
Geng Yuan
Wei Niu
T. Jian
Bin Ren
Stratis Ioannidis
Yanzhi Wang
Jennifer Dy
CLL
49
57
0
20 Sep 2022
Fine-tuned Language Models are Continual Learners
Thomas Scialom
Tuhin Chakrabarty
Smaranda Muresan
CLL
LRM
132
116
0
24 May 2022
Enhancing Continual Learning with Global Prototypes: Counteracting Negative Representation Drift
Xueying Bai
Jinghuan Shang
Yifan Sun
Niranjan Balasubramanian
CLL
17
1
0
24 May 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
315
8,261
0
28 Jan 2022
Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning
Zixuan Ke
Bing-Quan Liu
Nianzu Ma
Hu Xu
Lei Shu
CLL
165
121
0
05 Dec 2021
Fast Model Editing at Scale
E. Mitchell
Charles Lin
Antoine Bosselut
Chelsea Finn
Christopher D. Manning
KELM
217
254
0
21 Oct 2021
LFPT5: A Unified Framework for Lifelong Few-shot Language Learning Based on Prompt Tuning of T5
Chengwei Qin
Shafiq R. Joty
CLL
150
96
0
14 Oct 2021
Time Masking for Temporal Language Models
Guy D. Rosin
Ido Guy
Kira Radinsky
CLL
KELM
157
55
0
12 Oct 2021
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation
Yue Wang
Weishi Wang
Shafiq R. Joty
S. Hoi
196
1,451
0
02 Sep 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester
Rami Al-Rfou
Noah Constant
VPVLM
275
3,784
0
18 Apr 2021
The Pile: An 800GB Dataset of Diverse Text for Language Modeling
Leo Gao
Stella Biderman
Sid Black
Laurence Golding
Travis Hoppe
...
Horace He
Anish Thite
Noa Nabeshima
Shawn Presser
Connor Leahy
AIMat
236
1,508
0
31 Dec 2020
Adversarial Continual Learning
Sayna Ebrahimi
Franziska Meier
Roberto Calandra
Trevor Darrell
Marcus Rohrbach
CLL
VLM
135
195
0
21 Mar 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020
1
2
Next