Cited By: arXiv 2301.08986
Adapting a Language Model While Preserving its General Knowledge
21 January 2023
Zixuan Ke, Yijia Shao, Haowei Lin, Hu Xu, Lei Shu, Bin Liu
Tags: KELM, CLL, VLM
Papers citing "Adapting a Language Model While Preserving its General Knowledge" (22 papers shown):
Fragile Mastery: Are Domain-Specific Trade-Offs Undermining On-Device Language Models?
Basab Jha, Firoj Paudel · 16 Mar 2025
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment
Jinhao Jiang, Junyi Li, Wayne Xin Zhao, Yang Song, Tao Zhang, Ji-Rong Wen · CLL · 15 Jul 2024
Efficient Continual Pre-training by Mitigating the Stability Gap
Yiduo Guo, Jie Fu, Huishuai Zhang, Dongyan Zhao, Yikang Shen · 21 Jun 2024
Word Matters: What Influences Domain Adaptation in Summarization?
Yinghao Li, Siyu Miao, Heyan Huang, Yang Gao · 21 Jun 2024
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, ..., Virendra Mehta, Matthew Blumberg, Victor May, Huu Nguyen, S. Pyysalo · LRM · 30 Mar 2024
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
Zihao Wang, Anji Liu, Haowei Lin, Jiaqi Li, Xiaojian Ma, Yitao Liang · ReLM, RALM, LRM · 08 Mar 2024
Investigating Continual Pretraining in Large Language Models: Insights and Implications
Çağatay Yıldız, Nishaanth Kanna Ravichandran, Prishruit Punia, Matthias Bethge, B. Ermiş · CLL, KELM, LRM · 27 Feb 2024
Balancing the Causal Effects in Class-Incremental Learning
Junhao Zheng, Ruiyan Wang, Chongzhi Zhang, Hu Feng, Qianli Ma · CML, CLL · 15 Feb 2024
Selecting Large Language Model to Fine-tune via Rectified Scaling Law
Haowei Lin, Baizhou Huang, Haotian Ye, Qinyu Chen, Zihao Wang, Sujian Li, Jianzhu Ma, Xiaojun Wan, James Y. Zou, Yitao Liang · 04 Feb 2024
Continual Learning: Applications and the Road Forward
Eli Verwimp, Rahaf Aljundi, Shai Ben-David, Matthias Bethge, Andrea Cossu, ..., J. Weijer, Bing Liu, Vincenzo Lomonaco, Tinne Tuytelaars, Gido M. van de Ven · CLL · 20 Nov 2023
Efficient Continual Pre-training for Building Domain Specific Large Language Models
Yong Xie, Karan Aggarwal, Aitzaz Ahmad · CLL · 14 Nov 2023
Towards Anytime Fine-tuning: Continually Pre-trained Language Models with Hypernetwork Prompt
Gangwei Jiang, Caigao Jiang, Siqiao Xue, James Y. Zhang, Junqing Zhou, Defu Lian, Ying Wei · VLM · 19 Oct 2023
Continual Pre-Training of Large Language Models: How to (re)warm your model?
Kshitij Gupta, Benjamin Thérien, Adam Ibrahim, Mats L. Richter, Quentin G. Anthony, Eugene Belilovsky, Irina Rish, Timothée Lesort · KELM · 08 Aug 2023
Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models Memories
Shizhe Diao, Tianyang Xu, Ruijia Xu, Jiawei Wang, Tong Zhang · MoE, AI4CE · 08 Jun 2023
Difference-Masking: Choosing What to Mask in Continued Pretraining
Alex Wilf, Syeda Nahida Akter, Leena Mathur, Paul Pu Liang, Sheryl Mathew, Mengrou Shou, Eric Nyberg, Louis-Philippe Morency · CLL, SSL · 23 May 2023
Continual Pre-training of Language Models
Zixuan Ke, Yijia Shao, Haowei Lin, Tatsuya Konishi, Gyuhak Kim, Bin Liu · CLL, KELM · 07 Feb 2023
Continual Training of Language Models for Few-Shot Learning
Zixuan Ke, Haowei Lin, Yijia Shao, Hu Xu, Lei Shu, Bin Liu · KELM, BDL, CLL · 11 Oct 2022
Achieving Forgetting Prevention and Knowledge Transfer in Continual Learning
Zixuan Ke, Bing-Quan Liu, Nianzu Ma, Hu Xu, Lei Shu · CLL · 05 Dec 2021
TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning
Yixuan Su, Fangyu Liu, Zaiqiao Meng, Tian Lan, Lei Shu, Ehsan Shareghi, Nigel Collier · 07 Nov 2021
The Power of Scale for Parameter-Efficient Prompt Tuning
Brian Lester, Rami Al-Rfou, Noah Constant · VPVLM · 18 Apr 2021
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin · 23 Jul 2020
A Mutual Information Maximization Perspective of Language Representation Learning
Lingpeng Kong, Cyprien de Masson d'Autume, Wang Ling, Lei Yu, Zihang Dai, Dani Yogatama · SSL · 18 Oct 2019