arXiv:2004.10964 (v3, latest)
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 1,369 papers shown
Gradient Descent with Provably Tuned Learning-rate Schedules
Dravyansh Sharma
84
0
0
04 Dec 2025
Adapting Large Language Models to Low-Resource Tibetan: A Two-Stage Continual and Supervised Fine-Tuning Study
Lifeng Chen
Ryan Lai
Tianming Liu
CLL
132
0
0
03 Dec 2025
Comparative Analysis of 47 Context-Based Question Answer Models Across 8 Diverse Datasets
Muhammad Muneeb
David B. Ascher
Ahsan Baidar Bakht
48
0
0
29 Nov 2025
Mortgage Language Model: Domain-Adaptive Pretraining with Residual Instruction, Alignment Tuning, and Task-Specific Routing
Manish Jain
Satheesh Kumar Ponnambalam
Salman Faroz
Chandrakanth Lns
Vinay Sharma
ALM
627
0
0
26 Nov 2025
Building Domain-Specific Small Language Models via Guided Data Generation
Aman Kumar
Ekant Muljibhai Amin
Xian Yeow Lee
Lasitha Vidyaratne
Ahmed K. Farahat
Dipanjan Ghosh
Yuta Koreeda
Chetan Gupta
ALM
172
0
0
23 Nov 2025
Bridging VLMs and Embodied Intelligence with Deliberate Practice Policy Optimization
Yi Zhang
Che Liu
Xiancong Ren
Hanchu Ni
Yingji Zhang
...
Zenglin Xu
Bin Shen
Qifan Wang
Jian Tang
Xiaozhu Ju
VLM
148
0
0
20 Nov 2025
Classification of Hope in Textual Data using Transformer-Based Models
Chukwuebuka Ijezue
Tania-Amanda Nkoyo Fredrick Eneye
Maaz Amjad
VLM
150
0
0
17 Nov 2025
Tokenize Once, Recommend Anywhere: Unified Item Tokenization for Multi-domain LLM-based Recommendation
Yu Hou
Won-Yong Shin
66
0
0
17 Nov 2025
NeuroLex: A Lightweight Domain Language Model for EEG Report Understanding and Generation
Kang Yin
Hye-Bin Shin
120
0
0
17 Nov 2025
Evaluating the Ability of Large Language Models to Identify Adherence to CONSORT Reporting Guidelines in Randomized Controlled Trials: A Methodological Evaluation Study
Zhichao He
Mouxiao Bian
Jianhong Zhu
Jiayuan Chen
Y Samuel Wang
Wenxia Zhao
Tianbin Li
Bing Han
Jie Xu
J. Wu
68
0
0
17 Nov 2025
Concept-Based Interpretability for Toxicity Detection
Samarth Garg
Deeksha Varshney
Divya Singh
Mamta
93
0
0
15 Nov 2025
Kunlun Anomaly Troubleshooter: Enabling Kernel-Level Anomaly Detection and Causal Reasoning for Large Model Distributed Inference
Yuyang Liu
Jingjing Cai
Jiayi Ren
Peng Zhou
Danyang Zhang
Yin Du
Shijian Li
114
0
0
08 Nov 2025
ManufactuBERT: Efficient Continual Pretraining for Manufacturing
Robin Armingaud
Romaric Besançon
72
0
0
07 Nov 2025
MIDI-LLM: Adapting Large Language Models for Text-to-MIDI Music Generation
Shih-Lun Wu
Yoon Kim
Cheng-Zhi Anna Huang
312
0
0
06 Nov 2025
BIRD: Bronze Inscription Restoration and Dating
Wenjie Hua
Hoang H. Nguyen
Gangyan Ge
AI4CE
139
0
0
03 Nov 2025
Exploring and Mitigating Gender Bias in Encoder-Based Transformer Models
Ariyan Hossain
Khondokar Mohammad Ahanaf Hannan
Rakinul Haque
Nowreen Tarannum Rafa
Humayra Musarrat
Shoaib Ahmed Dipu
Farig Yousuf Sadeque
93
0
0
01 Nov 2025
Multilingual BERT language model for medical tasks: Evaluation on domain-specific adaptation and cross-linguality
Yinghao Luo
Lang Zhou
Amrish Jhingoer
Klaske Vliegenthart--Jongbloed
Carlijn Jordans
Ben Werkhoven
T. Seinen
E. V. Mulligen
Casper Rokx
Yunlei Li
LM&MA
179
0
0
31 Oct 2025
From Amateur to Master: Infusing Knowledge into LLMs via Automated Curriculum Learning
Nishit Neema
Srinjoy Mukherjee
Sapan Shah
Gokul Ramakrishnan
Ganesh Venkatesh
CLL
248
0
0
30 Oct 2025
Beyond One-Size-Fits-All: Personalized Harmful Content Detection with In-Context Learning
Rufan Zhang
Lin Zhang
Xianghang Mi
76
0
0
29 Oct 2025
A Survey on LLM Mid-Training
Chengying Tu
Xuemiao Zhang
Rongxiang Weng
Rumei Li
Chen Zhang
Yang Bai
Hongfei Yan
Jingang Wang
Xunliang Cai
OffRL
LRM
233
1
0
27 Oct 2025
Generating Auxiliary Tasks with Reinforcement Learning
Judah Goldfeder
Matthew So
Hod Lipson
OffRL
238
0
0
27 Oct 2025
Network Intrusion Detection: Evolution from Conventional Approaches to LLM Collaboration and Emerging Risks
Yaokai Feng
Kouichi Sakurai
191
1
0
27 Oct 2025
Robust Uncertainty Quantification for Self-Evolving Large Language Models via Continual Domain Pretraining
Xiaofan Zhou
Lu Cheng
CLL
373
0
0
27 Oct 2025
Low-Resource Dialect Adaptation of Large Language Models: A French Dialect Case-Study
Eeham Khan
Firas Saidani
Owen Van Esbroeck
Richard Khoury
Leila Kosseim
132
0
0
26 Oct 2025
From Slides to Chatbots: Enhancing Large Language Models with University Course Materials
Tu Anh Dinh
Philipp Nicolas Schumacher
Jan Niehues
70
0
0
25 Oct 2025
PatenTEB: A Comprehensive Benchmark and Model Family for Patent Text Embedding
Iliass Ayaou
Denis Cavallucci
88
0
0
25 Oct 2025
VESSA: Video-based objEct-centric Self-Supervised Adaptation for Visual Foundation Models
Jesimon Barreto
C. Caetano
A. Araújo
William Robson Schwartz
VLM
132
0
0
23 Oct 2025
IKnow: Instruction-Knowledge-Aware Continual Pretraining for Effective Domain Adaptation
Tianyi Zhang
Florian Mai
Lucie Flek
CLL
97
0
0
23 Oct 2025
Adapting Multilingual Models to Code-Mixed Tasks via Model Merging
Prashant Kodali
Vaishnavi Shivkumar
Swarang Joshi
Monojit Choudhary
Ponnurangam Kumaraguru
Manish Shrivastava
MoMe
CLL
322
0
0
22 Oct 2025
AtlasKV: Augmenting LLMs with Billion-Scale Knowledge Graphs in 20GB VRAM
Haoyu Huang
Hong Ting Tsang
Jiaxin Bai
Xi Peng
Gong Zhang
Yangqiu Song
VLM
166
0
0
20 Oct 2025
Qomhra: A Bilingual Irish and English Large Language Model
Joseph McInerney
Khanh-Tung Tran
Liam Lonergan
Ailbhe Ní Chasaide
Neasa Ní Chiaráin
Barry Devereux
134
0
0
20 Oct 2025
Midtraining Bridges Pretraining and Posttraining Distributions
Emmy Liu
Graham Neubig
Chenyan Xiong
CLL
184
1
0
16 Oct 2025
Cognitive-Aligned Spatio-Temporal Large Language Models For Next Point-of-Interest Prediction
Penglong Zhai
Jie Li
Fanyi Di
Yue Liu
Yifang Yuan
...
S. Wang
Mingyang Yin
Tingting Hu
Yao Xu
Xin Li
129
0
0
16 Oct 2025
First Attentions Last: Better Exploiting First Attentions for Efficient Transformer Training
Gyudong Kim
Hyukju Na
Jin Hyeon Kim
Hyunsung Jang
Jaemin Park
J. Hwang
Namkoo Ha
Seungryong Kim
Young Geun Kim
92
0
0
16 Oct 2025
Sparse Subnetwork Enhancement for Underrepresented Languages in Large Language Models
Daniil Gurgurov
Josef van Genabith
Simon Ostermann
MoE
194
0
0
15 Oct 2025
A-IPO: Adaptive Intent-driven Preference Optimization
Wenqing Wang
Muhammad Asif Ali
Ali Shoker
Ruohan Yang
Junyang Chen
Ying Sha
Huan Wang
81
0
0
11 Oct 2025
Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics
Lianhao Zhou
Hongyi Ling
Cong Fu
Yepeng Huang
Michael Sun
...
X. Qian
Heng Ji
Wei Wang
Marinka Zitnik
Shuiwang Ji
LLMAG
LM&Ro
AI4CE
168
3
0
10 Oct 2025
Understanding the Effects of Domain Finetuning on LLMs
Eshaan Tanwar
Deepak Nathani
William Yang Wang
Tanmoy Chakraborty
128
0
0
10 Oct 2025
SkipSR: Faster Super Resolution with Token Skipping
Rohan Choudhury
Shanchuan Lin
Jianyi Wang
Hao Chen
Qi Zhao
Feng Cheng
Lu Jiang
Kris Kitani
László A. Jeni
SupR
209
0
0
09 Oct 2025
DACIP-RC: Domain Adaptive Continual Instruction Pre-Training via Reading Comprehension on Business Conversations
Elena Khasanova
Harsh Saini
Md Tahmid Rahman Laskar
Xue-Yong Fu
Cheng Chen
Shashi Bhushan TN
CLL
104
0
0
09 Oct 2025
SliceFine: The Universal Winning-Slice Hypothesis for Pretrained Networks
Md. Kowsher
Ali O. Polat
Ehsan Mohammady Ardehaly
Mehrdad Salehi
Zia Ghiasi
Prasanth Murali
Chen Chen
166
1
0
09 Oct 2025
Beyond Monolingual Assumptions: A Survey of Code-Switched NLP in the Era of Large Language Models across Modalities
Rajvee Sheth
Samridhi Raj Sinha
Mahavir Patil
Himanshu Beniwal
Mayank Singh
251
0
0
08 Oct 2025
Reward Model Perspectives: Whose Opinions Do Reward Models Reward?
Elle
ALM
132
1
0
07 Oct 2025
Contrastive Learning Using Graph Embeddings for Domain Adaptation of Language Models in the Process Industry
Anastasia Zhukova
Jonas Lührs
Christian E. Lobmüller
Bela Gipp
157
0
0
06 Oct 2025
AWARE, Beyond Sentence Boundaries: A Contextual Transformer Framework for Identifying Cultural Capital in STEM Narratives
Khalid Mehtab Khan
Anagha Kulkarni
82
0
0
06 Oct 2025
Train on Validation (ToV): Fast data selection with applications to fine-tuning
Ayush Jain
Andrea Montanari
Eren Sasoglu
153
1
0
01 Oct 2025
CustomIR: Unsupervised Fine-Tuning of Dense Embeddings for Known Document Corpora
Nathan Paull
94
0
0
30 Sep 2025
Metaphor identification using large language models: A comparison of RAG, prompt engineering, and fine-tuning
Matteo Fuoli
Weihang Huang
Jeannette Littlemore
Sarah Turner
Ellen Wilding
167
0
0
29 Sep 2025
Predicting Training Re-evaluation Curves Enables Effective Data Curriculums for LLMs
Shane Bergsma
Nolan Dey
Joel Hestness
150
0
0
29 Sep 2025
WirelessMathLM: Teaching Mathematical Reasoning for LLMs in Wireless Communications with Reinforcement Learning
Xin Li
Mengbing Liu
Yiyang Zhu
W. Zhang
Li Wei
Jiancheng An
Chau Yuen
LRM
66
0
0
27 Sep 2025