ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2402.17400
  4. Cited By
Investigating Continual Pretraining in Large Language Models: Insights and Implications

Investigating Continual Pretraining in Large Language Models: Insights and Implications

27 February 2024
cCaugatay Yildiz
Nishaanth Kanna Ravichandran
Prishruit Punia
Matthias Bethge
B. Ermiş
    CLL
    KELM
    LRM
ArXivPDFHTML

Papers citing "Investigating Continual Pretraining in Large Language Models: Insights and Implications"

19 / 19 papers shown
Title
Rethinking Multilingual Continual Pretraining: Data Mixing for Adapting LLMs Across Languages and Resources
Rethinking Multilingual Continual Pretraining: Data Mixing for Adapting LLMs Across Languages and Resources
Zihao Li
Shaoxiong Ji
Hengyu Luo
Jörg Tiedemann
CLL
33
0
0
05 Apr 2025
Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them
Enhancing Domain-Specific Encoder Models with LLM-Generated Data: How to Leverage Ontologies, and How to Do Without Them
Marc Felix Brinner
Tarek Al Mustafa
Sina Zarrieß
29
0
0
27 Mar 2025
ORANSight-2.0: Foundational LLMs for O-RAN
Pranshav Gajjar
Vijay K. Shah
AI4TS
LRM
VLM
31
0
0
07 Mar 2025
Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training
Paul Janson
Vaibhav Singh
Paria Mehrbod
Adam Ibrahim
Irina Rish
Eugene Belilovsky
Benjamin Thérien
CLL
68
0
0
04 Mar 2025
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit
Oleg Filatov
Jan Ebert
Jiangtao Wang
Stefan Kesselheim
29
3
0
10 Jan 2025
How to Merge Your Multimodal Models Over Time?
How to Merge Your Multimodal Models Over Time?
Sebastian Dziadzio
Vishaal Udandarao
Karsten Roth
Ameya Prabhu
Zeynep Akata
Samuel Albanie
Matthias Bethge
MoMe
82
2
0
09 Dec 2024
Adapting Large Language Models to Log Analysis with Interpretable Domain
  Knowledge
Adapting Large Language Models to Log Analysis with Interpretable Domain Knowledge
Yuhe Ji
Yilun Liu
Feiyu Yao
Minggui He
Shimin Tao
...
Xinhua Yang
Weibin Meng
Yuming Xie
Boxing Chen
Hao Yang
65
2
0
02 Dec 2024
Continual Memorization of Factoids in Language Models
Continual Memorization of Factoids in Language Models
Howard Chen
Jiayi Geng
Adithya Bhaskar
Dan Friedman
Danqi Chen
KELM
29
0
0
11 Nov 2024
From Tokens to Words: On the Inner Lexicon of LLMs
From Tokens to Words: On the Inner Lexicon of LLMs
Guy Kaplan
Matanel Oren
Yuval Reif
Roy Schwartz
32
12
0
08 Oct 2024
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning
  and Format Alignment
Mix-CPT: A Domain Adaptation Framework via Decoupling Knowledge Learning and Format Alignment
Jinhao Jiang
Junyi Li
Wayne Xin Zhao
Yang Song
Tao Zhang
Ji-Rong Wen
CLL
22
3
0
15 Jul 2024
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
Ashwinee Panda
Berivan Isik
Xiangyu Qi
Sanmi Koyejo
Tsachy Weissman
Prateek Mittal
MoMe
29
7
0
24 Jun 2024
Efficient Continual Pre-training by Mitigating the Stability Gap
Efficient Continual Pre-training by Mitigating the Stability Gap
Yiduo Guo
Jie Fu
Huishuai Zhang
Dongyan Zhao
Yikang Shen
23
12
0
21 Jun 2024
Towards Lifelong Learning of Large Language Models: A Survey
Towards Lifelong Learning of Large Language Models: A Survey
Junhao Zheng
Shengjie Qiu
Chengming Shi
Qianli Ma
KELM
CLL
20
14
0
10 Jun 2024
When LLMs Meet Cybersecurity: A Systematic Literature Review
When LLMs Meet Cybersecurity: A Systematic Literature Review
Jie Zhang
Haoyu Bu
Hui Wen
Yu Chen
Lun Li
Hongsong Zhu
24
36
0
06 May 2024
Towards Incremental Learning in Large Language Models: A Critical Review
Towards Incremental Learning in Large Language Models: A Critical Review
M. Jovanovic
Peter Voss
ELM
CLL
KELM
26
5
0
28 Apr 2024
Continual Learning of Large Language Models: A Comprehensive Survey
Continual Learning of Large Language Models: A Comprehensive Survey
Haizhou Shi
Zihao Xu
Hengyi Wang
Weiyi Qin
Wenyuan Wang
Yibin Wang
Zifeng Wang
Sayna Ebrahimi
Hao Wang
CLL
KELM
LRM
26
47
0
25 Apr 2024
Orthogonal Subspace Learning for Language Model Continual Learning
Orthogonal Subspace Learning for Language Model Continual Learning
Xiao Wang
Tianze Chen
Qiming Ge
Han Xia
Rong Bao
Rui Zheng
Qi Zhang
Tao Gui
Xuanjing Huang
CLL
110
85
0
22 Oct 2023
Fine-tuned Language Models are Continual Learners
Fine-tuned Language Models are Continual Learners
Thomas Scialom
Tuhin Chakrabarty
Smaranda Muresan
CLL
LRM
132
116
0
24 May 2022
Scaling Laws for Neural Language Models
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
220
3,054
0
23 Jan 2020
1