ResearchTrend.AI
Lifelong Language Pretraining with Distribution-Specialized Experts

20 May 2023
Wuyang Chen
Yanqi Zhou
Nan Du
Yanping Huang
James Laudon
Z. Chen
Claire Cui
    KELM

Papers citing "Lifelong Language Pretraining with Distribution-Specialized Experts"

39 / 39 papers shown
Task-Core Memory Management and Consolidation for Long-term Continual Learning
Tianyu Huai
Jie Zhou
Yuxuan Cai
Qin Chen
Wen Wu
Xingjiao Wu
Xipeng Qiu
Liang He
CLL
24
0
0
15 May 2025
Learning Dynamics in Continual Pre-Training for Large Language Models
Xingjin Wang
Howe Tissue
Lu Wang
Linjing Li
D. Zeng
CLL
29
0
0
12 May 2025
SEE: Continual Fine-tuning with Sequential Ensemble of Experts
Zhilin Wang
Yafu Li
Xiaoye Qu
Yu Cheng
CLL
KELM
50
0
0
09 Apr 2025
Saliency-Motion Guided Trunk-Collateral Network for Unsupervised Video Object Segmentation
Xiangyu Zheng
Wanyun Li
Songcheng He
Jianping Fan
Xiaoqiang Li
Wei Zhang
VOS
27
0
0
08 Apr 2025
Continual Cross-Modal Generalization
Yan Xia
Hai Huang
Minghui Fang
Zhou Zhao
CLL
54
0
0
01 Apr 2025
LLaVA-CMoE: Towards Continual Mixture of Experts for Large Vision-Language Models
Hengyuan Zhao
Ziqin Wang
Qixin Sun
Kaiyou Song
Yilin Li
Xiaolin Hu
Qingpei Guo
Si Liu
KELM
CLL
MoE
65
0
0
27 Mar 2025
A Comprehensive Survey of Mixture-of-Experts: Algorithms, Theory, and Applications
Siyuan Mu
Sen Lin
MoE
123
1
0
10 Mar 2025
Towards Differential Handling of Various Blur Regions for Accurate Image Deblurring
Hu Gao
Depeng Dang
46
0
0
27 Feb 2025
Recurrent Knowledge Identification and Fusion for Language Model Continual Learning
Yujie Feng
Xujia Wang
Zexin Lu
Shenghong Fu
Guangyuan Shi
Yongxin Xu
Yasha Wang
Philip S. Yu
Xu Chu
Xiao-Ming Wu
CLL
KELM
41
1
0
22 Feb 2025
Scalable Multi-Domain Adaptation of Language Models using Modular Experts
Peter Schafhalter
Shun Liao
Yanqi Zhou
Chih-Kuan Yeh
Arun Kandoor
James Laudon
MoE
24
1
0
14 Oct 2024
CiMaTe: Citation Count Prediction Effectively Leveraging the Main Text
Jun Hirako
Ryohei Sasano
Koichi Takeda
32
1
0
06 Oct 2024
Exploring the Benefit of Activation Sparsity in Pre-training
Zhengyan Zhang
Chaojun Xiao
Qiujieli Qin
Yankai Lin
Zhiyuan Zeng
Xu Han
Zhiyuan Liu
Ruobing Xie
Maosong Sun
Jie Zhou
MoE
64
3
0
04 Oct 2024
Drift to Remember
Jin Du
X. Zhang
Hao Shen
Xun Xian
Ganghua Wang
Jiawei Zhang
Yuhong Yang
Na Li
Jia Liu
Jie Ding
CLL
16
0
0
21 Sep 2024
Continual learning with the neural tangent ensemble
Ari S. Benjamin
Christian Pehle
Kyle Daruwalla
UQCV
62
0
0
30 Aug 2024
MoE-LPR: Multilingual Extension of Large Language Models through Mixture-of-Experts with Language Priors Routing
Hao Zhou
Zhijun Wang
Shujian Huang
Xin Huang
Xue Han
Junlan Feng
Chao Deng
Weihua Luo
Jiajun Chen
CLL
MoE
49
5
0
21 Aug 2024
Towards Efficient Large Language Models for Scientific Text: A Review
H. To
Ming Liu
Guangyan Huang
35
1
0
20 Aug 2024
A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
Prateek Yadav
Colin Raffel
Mohammed Muqeeth
Lucas Page-Caccia
Haokun Liu
Tianlong Chen
Mohit Bansal
Leshem Choshen
Alessandro Sordoni
MoMe
41
21
0
13 Aug 2024
Mixture of A Million Experts
Xu Owen He
MoE
36
25
0
04 Jul 2024
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang
Dong Shen
Chaoxiang Cai
Fan Yang
Size Li
Di Zhang
Xi Li
MoE
45
2
0
28 Jun 2024
MoE-CT: A Novel Approach For Large Language Models Training With Resistance To Catastrophic Forgetting
Tianhao Li
Shangjie Li
Binbin Xie
Deyi Xiong
Baosong Yang
CLL
50
3
0
25 Jun 2024
Efficient Continual Pre-training by Mitigating the Stability Gap
Yiduo Guo
Jie Fu
Huishuai Zhang
Dongyan Zhao
Yikang Shen
30
13
0
21 Jun 2024
Towards Lifelong Learning of Large Language Models: A Survey
Junhao Zheng
Shengjie Qiu
Chengming Shi
Qianli Ma
KELM
CLL
28
14
0
10 Jun 2024
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
Haoran Que
Jiaheng Liu
Ge Zhang
Chenchen Zhang
Xingwei Qu
...
Jie Fu
Wenbo Su
Jiamang Wang
Lin Qu
Bo Zheng
CLL
36
13
0
03 Jun 2024
Auto-selected Knowledge Adapters for Lifelong Person Re-identification
Xuelin Qian
Ruiqi Wu
Gong Cheng
Junwei Han
CLL
44
2
0
29 May 2024
Continual Learning of Large Language Models: A Comprehensive Survey
Haizhou Shi
Zihao Xu
Hengyi Wang
Weiyi Qin
Wenyuan Wang
Yibin Wang
Zifeng Wang
Sayna Ebrahimi
Hao Wang
CLL
KELM
LRM
39
63
0
25 Apr 2024
Self-Expansion of Pre-trained Models with Mixture of Adapters for Continual Learning
Huiyi Wang
Haodong Lu
Lina Yao
Dong Gong
KELM
CLL
40
8
0
27 Mar 2024
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters
Jiazuo Yu
Yunzhi Zhuge
Lu Zhang
Ping Hu
Dong Wang
Huchuan Lu
You He
VLM
KELM
CLL
OODD
108
68
0
18 Mar 2024
Investigating Continual Pretraining in Large Language Models: Insights and Implications
Çağatay Yıldız
Nishaanth Kanna Ravichandran
Prishruit Punia
Matthias Bethge
B. Ermiş
CLL
KELM
LRM
48
25
0
27 Feb 2024
Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent
Xiaoyan Yu
Tongxu Luo
Yifan Wei
Fangyu Lei
Yiming Huang
Peng Hao
Liehuang Zhu
LLMAG
32
22
0
21 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Tianyi Zhou
KELM
VLM
42
100
0
20 Feb 2024
MoRAL: MoE Augmented LoRA for LLMs' Lifelong Learning
Shu Yang
Muhammad Asif Ali
Cheng-Long Wang
Lijie Hu
Di Wang
CLL
MoE
37
38
0
17 Feb 2024
Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence
Weixiang Zhao
Zhuojun Li
Shilong Wang
Yang Wang
Yulin Hu
Yanyan Zhao
Chen Wei
Bing Qin
22
4
0
15 Feb 2024
Carpe Diem: On the Evaluation of World Knowledge in Lifelong Language Models
Yujin Kim
Jaehong Yoon
Seonghyeon Ye
Sangmin Bae
Namgyu Ho
Sung Ju Hwang
Se-Young Yun
KELM
32
9
0
14 Nov 2023
How Do Large Language Models Capture the Ever-changing World Knowledge? A Review of Recent Advances
Zihan Zhang
Meng Fang
Lingxi Chen
Mohammad-Reza Namazi-Rad
Jun Wang
KELM
19
21
0
11 Oct 2023
Continual Learning with Dirichlet Generative-based Rehearsal
Min Zeng
Wei Xue
Qi-fei Liu
Yi-Ting Guo
CLL
BDL
24
5
0
13 Sep 2023
Mitigating the Alignment Tax of RLHF
Yong Lin
Hangyu Lin
Wei Xiong
Shizhe Diao
Zeming Zheng
...
Han Zhao
Nan Jiang
Heng Ji
Yuan Yao
Tong Zhang
MoMe
CLL
29
63
0
12 Sep 2023
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta
Yanping Huang
Ankur Bapna
M. Krikun
Dmitry Lepikhin
Minh-Thang Luong
Orhan Firat
MoE
119
106
0
24 Sep 2021
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
228
4,460
0
23 Jan 2020
Efficient Estimation of Word Representations in Vector Space
Tomáš Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
230
31,253
0
16 Jan 2013