Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2402.17400
Cited By
Investigating Continual Pretraining in Large Language Models: Insights and Implications
27 February 2024
cCaugatay Yildiz
Nishaanth Kanna Ravichandran
Prishruit Punia
Matthias Bethge
B. Ermiş
CLL
KELM
LRM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Investigating Continual Pretraining in Large Language Models: Insights and Implications"
19 / 19 papers shown
Title
Rethinking Multilingual Continual Pretraining: Data Mixing for Adapting LLMs Across Languages and Resources
Zihao Li
Shaoxiong Ji
Hengyu Luo
Jörg Tiedemann
CLL
33
0
0
05 Apr 2025
Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them
Marc Felix Brinner
Tarek Al Mustafa
Sina Zarrieß
29
0
0
27 Mar 2025
ORANSight-2.0: Foundational LLMs for O-RAN
Pranshav Gajjar
Vijay K. Shah
AI4TS
LRM
VLM
31
0
0
07 Mar 2025
Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training
Paul Janson
Vaibhav Singh
Paria Mehrbod
Adam Ibrahim
Irina Rish
Eugene Belilovsky
Benjamin Thérien
CLL
68
0
0
04 Mar 2025
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
29
3
0
10 Jan 2025
How to Merge Your Multimodal Models Over Time?
Sebastian Dziadzio
Vishaal Udandarao
Karsten Roth
Ameya Prabhu
Zeynep Akata
Samuel Albanie
Matthias Bethge
MoMe
82
2
0
09 Dec 2024
Adapting Large Language Models to Log Analysis with Interpretable Domain Knowledge
Yuhe Ji
Yilun Liu
Feiyu Yao
Minggui He
Shimin Tao
...
Xinhua Yang
Weibin Meng
Yuming Xie
Boxing Chen
Hao Yang
65
2
0
02 Dec 2024
Continual Memorization of Factoids in Language Models
Howard Chen
Jiayi Geng
Adithya Bhaskar
Dan Friedman
Danqi Chen
KELM
29
0
0
11 Nov 2024
From Tokens to Words: On the Inner Lexicon of LLMs
Guy Kaplan
Matanel Oren
Yuval Reif
Roy Schwartz
32
12
0
08 Oct 2024
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment
Jinhao Jiang
Junyi Li
Wayne Xin Zhao
Yang Song
Tao Zhang
Ji-Rong Wen
CLL
22
3
0
15 Jul 2024
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Ashwinee Panda
Berivan Isik
Xiangyu Qi
Sanmi Koyejo
Tsachy Weissman
Prateek Mittal
MoMe
29
7
0
24 Jun 2024
Efficient Continual Pre-training by Mitigating the Stability Gap
Yiduo Guo
Jie Fu
Huishuai Zhang
Dongyan Zhao
Yikang Shen
23
12
0
21 Jun 2024
Towards Lifelong Learning of Large Language Models: A Survey
Junhao Zheng
Shengjie Qiu
Chengming Shi
Qianli Ma
KELM
CLL
20
14
0
10 Jun 2024
When LLMs Meet Cybersecurity: A Systematic Literature Review
Jie Zhang
Haoyu Bu
Hui Wen
Yu Chen
Lun Li
Hongsong Zhu
24
36
0
06 May 2024
Towards Incremental Learning in Large Language Models: A Critical Review
M. Jovanovic
Peter Voss
ELM
CLL
KELM
26
5
0
28 Apr 2024
Continual Learning of Large Language Models: A Comprehensive Survey
Haizhou Shi
Zihao Xu
Hengyi Wang
Weiyi Qin
Wenyuan Wang
Yibin Wang
Zifeng Wang
Sayna Ebrahimi
Hao Wang
CLL
KELM
LRM
26
47
0
25 Apr 2024
Orthogonal Subspace Learning for Language Model Continual Learning
Xiao Wang
Tianze Chen
Qiming Ge
Han Xia
Rong Bao
Rui Zheng
Qi Zhang
Tao Gui
Xuanjing Huang
CLL
110
85
0
22 Oct 2023
Fine-tuned Language Models are Continual Learners
Thomas Scialom
Tuhin Chakrabarty
Smaranda Muresan
CLL
LRM
132
116
0
24 May 2022
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020
1