arXiv:2004.10964
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks

23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM, AI4CE, CLL

Papers citing "Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"

50 / 1,369 papers shown
Learning the Wrong Lessons: Syntactic-Domain Spurious Correlations in Language Models
Chantal Shaib
Vinith Suriyakumar
Levent Sagun
Byron C. Wallace
Elisa Kreiss
LRM
172
2
0
25 Sep 2025
Policy Compatible Skill Incremental Learning via Lazy Learning Interface
Daehee Lee
Dongsu Lee
TaeYoon Kwack
Wonje Choi
Honguk Woo
CLL
274
0
0
24 Sep 2025
Embedding Domain Knowledge for Large Language Models via Reinforcement Learning from Augmented Generation
Chaojun Nie
Jun Zhou
G. Wang
Shisong Wud
Zichen Wang
166
0
0
24 Sep 2025
Detoxifying Large Language Models via Autoregressive Reward Guided Representation Editing
Yisong Xiao
Aishan Liu
Siyuan Liang
Zonghao Ying
Xianglong Liu
Dacheng Tao
KELM
152
2
0
24 Sep 2025
Memory in Large Language Models: Mechanisms, Evaluation and Evolution
D. Zhang
Wendong Li
Kani Song
Jiaye Lu
Gang Li
Liuchun Yang
Sheng Li
KELM
209
1
0
23 Sep 2025
Visual Instruction Pretraining for Domain-Specific Foundation Models
Yuxuan Li
Y. Zhang
Wenhao Tang
Yimian Dai
Ming-Ming Cheng
Xiang Li
Jian Yang
LRM
289
3
0
22 Sep 2025
PG-CE: A Progressive Generation Dataset with Constraint Enhancement for Controllable Text Generation
Yan Zhuang
Yuan Sun
97
0
0
22 Sep 2025
Rethinking the Role of Text Complexity in Language Model Pretraining
Dan John Velasco
M. R
212
2
0
20 Sep 2025
Domain-Adaptive Pre-Training for Arabic Aspect-Based Sentiment Analysis: A Comparative Study of Domain Adaptation and Fine-Tuning Strategies
Salha Alyami
A. Jamal
Areej M. Alhothali
109
0
0
20 Sep 2025
Optimizing Product Deduplication in E-Commerce with Multimodal Embeddings
Aysenur Kulunk
Berk Taskin
M. Furkan Eseoglu
H. Bahadir Sahin
156
0
0
19 Sep 2025
RoadMind: Towards a Geospatial AI Expert for Disaster Response
Ahmed El Fekih Zguir
Ferda Ofli
Muhammad Imran
LRM
77
1
0
18 Sep 2025
Deep learning and abstractive summarisation for radiological reports: an empirical study for adapting the PEGASUS models' family with scarce data
Claudio Benzoni
Martina Langhals
Martin Boeker
Luise Modersohn
Máté E. Maros
MedIm
107
0
0
18 Sep 2025
Boosting Data Utilization for Multilingual Dense Retrieval
Chao Huang
Fengran Mo
Yufeng Chen
Changhao Guan
Zhenrui Yue
Xinyu Wang
Jinan Xu
Kaiyu Huang
140
2
0
11 Sep 2025
Towards EnergyGPT: A Large Language Model Specialized for the Energy Sector
Amal Chebbi
Babajide Kolade
116
1
0
08 Sep 2025
Augmented Fine-Tuned LLMs for Enhanced Recruitment Automation
Mohamed T. Younes
Omar Walid
Khaled Shaban
Ali Hamdi
Mai Hassan
36
0
0
07 Sep 2025
Hierarchical Section Matching Prediction (HSMP) BERT for Fine-Grained Extraction of Structured Data from Hebrew Free-Text Radiology Reports in Crohn's Disease
Zvi Badash
Hadas Ben-Atya
N. Gavrielov
L. Hazan
G. Focht
R. Cytter-Kuint
Talar Hagopian
Dan Turner
Moti Freiman
64
0
0
03 Sep 2025
Linear-Time Demonstration Selection for In-Context Learning via Gradient Estimation
Ziniu Zhang
Zhenshuo Zhang
Dongyue Li
Lu Wang
Jennifer Dy
Hongyang R. Zhang
134
4
0
27 Aug 2025
Active Domain Knowledge Acquisition with 100-Dollar Budget: Enhancing LLMs via Cost-Efficient, Expert-Involved Interaction in Sensitive Domains
Yang Wu
Raha Moraffah
Rujing Yao
Jinhong Yu
Zhimin Tao
Xiaozhong Liu
154
2
0
24 Aug 2025
ChatGPT-generated texts show authorship traits that identify them as non-human
Vittoria Dentella
Weihang Huang
Silvia Angela Mansi
Jack Grieve
Evelina Leivada
DeLMO
144
0
0
22 Aug 2025
LegalΔ: Enhancing Legal Reasoning in LLMs via Reinforcement Learning with Chain-of-Thought Guided Information Gain
Xin Dai
Buqiang Xu
Zhenghao Liu
Shi Yu
Huiyuan Xie
Xiaoyuan Yi
Kaiyan Zhang
Ge Yu
AILaw, ELM, LRM
188
0
0
17 Aug 2025
When Does Language Transfer Help? Sequential Fine-Tuning for Cross-Lingual Euphemism Detection
Julia Sammartino
Libby Barak
Jing Peng
Anna Feldman
84
0
0
15 Aug 2025
ALAS: Autonomous Learning Agent for Self-Updating Language Models
Dhruv Atreja
KELM
60
1
0
14 Aug 2025
AnalogSeeker: An Open-source Foundation Language Model for Analog Circuit Design
Zihao Chen
Ji Zhuang
Jinyi Shen
Xiaoyue Ke
Xinyi Yang
...
Zhenyu Xu
J. Huang
L. Shang
Xuan Zeng
Fan Yang
128
2
0
14 Aug 2025
Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models
Jiaqi Cao
Jiarui Wang
Rubin Wei
Qipeng Guo
Kai Chen
Bowen Zhou
Zhouhan Lin
RALM, CLL
174
2
0
13 Aug 2025
Effortless Vision-Language Model Specialization in Histopathology without Annotation
Jingna Qiu
Nishanth Jain
Jonas Ammeling
Marc Aubreville
Katharina Breininger
VLM
109
0
0
11 Aug 2025
Sensitivity of Stability: Theoretical & Empirical Analysis of Replicability for Adaptive Data Selection in Transfer Learning
Prabhav Singh
Jessica Sorrell
132
0
0
06 Aug 2025
Multidimensional classification of posts for online course discussion forum curation
Antonio Leandro Martins Candido
Jose Everardo Bessa Maia
78
0
0
05 Aug 2025
LLM-based IR-system for Bank Supervisors
Knowledge-Based Systems (KBS), 2024
Ilias Aarab
117
2
0
04 Aug 2025
OpenMed NER: Open-Source, Domain-Adapted State-of-the-Art Transformers for Biomedical NER Across 12 Public Datasets
Maziyar Panahi
MedIm, VLM, AI4CE
143
0
0
03 Aug 2025
Measuring Time-Series Dataset Similarity using Wasserstein Distance
Hongjie Chen
Akshay Mehra
Josh Kimball
Ryan Rossi
AI4TS
112
1
0
29 Jul 2025
Improving Community Detection in Academic Networks by Handling Publication Bias
Md Asaduzzaman Noor
John Sheppard
Jason Clark
87
0
0
28 Jul 2025
AI-Driven Generation of Old English: A Framework for Low-Resource Languages
Rodrigo Gabriel Salazar Alva
Matías Nuñez
Cristian López
Javier Martín Arista
108
0
0
27 Jul 2025
AQuilt: Weaving Logic and Self-Inspection into Low-Cost, High-Relevance Data Synthesis for Specialist LLMs
Xiaopeng Ke
Hexuan Deng
Xuebo Liu
Jun Rao
Zhenxi Song
Jun-chen Yu
Min Zhang
SyDa
227
1
0
24 Jul 2025
CCL-XCoT: An Efficient Cross-Lingual Knowledge Transfer Method for Mitigating Hallucination Generation
Weihua Zheng
Roy Ka-Wei Lee
Zhengyuan Liu
Kui Wu
AiTi Aw
Bowei Zou
HILM, LRM
111
2
0
17 Jul 2025
Simplifying Traffic Anomaly Detection with Video Foundation Models
Svetlana Orlova
Tommie Kerssies
B. B. Englert
Gijs Dubbelman
ViT
121
1
0
12 Jul 2025
ixi-GEN: Efficient Industrial sLLMs through Domain Adaptive Continual Pretraining
Seonwu Kim
Yohan Na
Kihun Kim
Hanhee Cho
Geun Lim
Mintae Kim
Seongik Park
Ki Hyun Kim
Youngsub Han
Byoung-Ki Jeon
CLL
246
0
0
09 Jul 2025
Domain adaptation of large language models for geotechnical applications
Lei Fan
Fangxue Liu
Cheng Chen
AI4CE
241
1
0
08 Jul 2025
Collaborative Editable Model
Kaiwen Tang
Aitong Wu
Yao Lu
Guangda Sun
KELM
189
0
0
17 Jun 2025
Just Go Parallel: Improving the Multilingual Capabilities of Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Muhammad Reza Qorib
Junyi Li
Hwee Tou Ng
LRM
251
4
0
16 Jun 2025
SoK: The Privacy Paradox of Large Language Models: Advancements, Privacy Risks, and Mitigation
ACM Asia Conference on Computer and Communications Security (AsiaCCS), 2025
Yashothara Shanmugarasa
Ming Ding
M. Chamikara
Thierry Rakotoarivelo
PILM, AILaw
431
10
0
15 Jun 2025
GeistBERT: Breathing Life into German NLP
Raphael Scheible-Schmitt
Johann Frei
VLM
383
3
0
13 Jun 2025
Curriculum-Guided Layer Scaling for Language Model Pretraining
Karanpartap Singh
Neil Band
Ehsan Adeli
ALM, LRM
231
0
0
13 Jun 2025
Self-Adapting Language Models
Adam Zweiger
Jyothish Pari
Han Guo
Ekin Akyürek
Yoon Kim
Pulkit Agrawal
KELM, LRM
604
16
0
12 Jun 2025
Spelling-out is not Straightforward: LLMs' Capability of Tokenization from Token to Characters
Tatsuya Hiraoka
Kentaro Inui
264
15
0
12 Jun 2025
Low-resource domain adaptation while minimizing energy and hardware resource consumption
Hernán Maina
Nicolás Wolovick
Luciana Benotti
161
0
0
10 Jun 2025
PropMEND: Hypernetworks for Knowledge Propagation in LLMs
Zeyu Leo Liu
Greg Durrett
Eunsol Choi
KELM
148
0
0
10 Jun 2025
ZeroVO: Visual Odometry with Minimal Assumptions
Computer Vision and Pattern Recognition (CVPR), 2025
Lei Lai
Zekai Yin
Eshed Ohn-Bar
VGen
222
3
0
09 Jun 2025
Through the Valley: Path to Effective Long CoT Training for Small Language Models
Renjie Luo
Jiaxi Li
Chen Huang
Wei Lu
LRM
233
2
0
09 Jun 2025
Plug-in and Fine-tuning: Bridging the Gap between Small Language Models and Large Language Models
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Kyeonghyun Kim
Jinhee Jang
Juhwan Choi
Yoonji Lee
Kyohoon Jin
Youngbin Kim
224
0
0
09 Jun 2025
Dynamic and Parametric Retrieval-Augmented Generation
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025
Weihang Su
Jiaxin Mao
Jingtao Zhan
Qian Dong
Yiqun Liu
RALM
149
7
0
07 Jun 2025