2004.10964
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 1,369 papers shown
Text-to-LoRA: Instant Transformer Adaption
Rujikorn Charakorn
Edoardo Cetin
Yujin Tang
Robert Tjarko Lange
AI4CE
270
6
0
06 Jun 2025
Policy Search, Retrieval, and Composition via Task Similarity in Collaborative Agentic Systems
Saptarshi Nath
Christos Peridis
Eseoghene Benjamin
Hengrong Du
Soheil Kolouri
Peter Kinnell
Zexin Li
Cong Liu
Shirin Dora
Andrea Soltoggio
307
0
0
05 Jun 2025
Building a Few-Shot Cross-Domain Multilingual NLU Model for Customer Care
European Conference on Artificial Intelligence (ECAI), 2025
Saurabh Kumar
Sourav Bansal
Neeraj Agrawal
Priyanka Bhatt
156
0
0
04 Jun 2025
Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Mingxuan Xia
Haobo Wang
Shouqing Yang
Zewei Yu
Yongfeng Zhang
Junbo Zhao
Runze Wu
346
1
0
04 Jun 2025
Backbone Augmented Training for Adaptations
Jae Wan Park
Junhyeok Kim
Youngjun Jun
Hyunah Ko
Seong Jae Hwang
203
0
0
04 Jun 2025
MSDA: Combining Pseudo-labeling and Self-Supervision for Unsupervised Domain Adaptation in ASR
Dimitrios Damianos
Georgios Paraskevopoulos
Alexandros Potamianos
371
1
0
30 May 2025
Fine-Tune an SLM or Prompt an LLM? The Case of Generating Low-Code Workflows
Orlando Marquez Ayala
Patrice Béchard
Emily Chen
Maggie Baird
Jingfei Chen
215
3
0
30 May 2025
Structuring Radiology Reports: Challenging LLMs with Lightweight Models
Johannes Moll
Louisa Fay
Asfandyar Azhar
Sophie Ostmeier
Tim Lueth
S. Gatidis
Curtis P. Langlotz
Jean-Benoit Delbrouck
264
1
0
30 May 2025
Domain Pre-training Impact on Representations
César González-Gutiérrez
A. Quattoni
167
0
0
30 May 2025
Skin Lesion Phenotyping via Nested Multi-modal Contrastive Learning
Dionysis Christopoulos
Sotiris Spanos
Eirini Baltzi
Valsamis Ntouskos
Konstantinos Karantzalos
223
1
0
29 May 2025
Improving Continual Pre-training Through Seamless Data Packing
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ruicheng Yin
Xuan Gao
Changze Lv
Xiaohua Wang
Xiaoqing Zheng
Qi Zhang
268
1
0
28 May 2025
Personalized Query Auto-Completion for Long and Short-Term Interests with Adaptive Detoxification Generation
Zhibo Wang
Xiaoze Jiang
Zhiheng Qin
Enyun Yu
Han Li
184
3
0
27 May 2025
Evaluation of LLMs in Medical Text Summarization: The Role of Vocabulary Adaptation in High OOV Settings
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Gunjan Balde
Soumyadeep Roy
Mainack Mondal
Niloy Ganguly
163
1
0
27 May 2025
Towards Objective Fine-tuning: How LLMs' Prior Knowledge Causes Potential Poor Calibration?
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Ziming Wang
Zeyu Shi
Haoyi Zhou
Shiqi Gao
Qingyun Sun
Jianxin Li
298
4
0
27 May 2025
Contrastive Learning on LLM Back Generation Treebank for Cross-domain Constituency Parsing
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Peiming Guo
Meishan Zhang
Jianling Li
Min Zhang
Yue Zhang
299
0
0
27 May 2025
Token Distillation: Attention-aware Input Embeddings For New Tokens
Konstantin Dobler
Desmond Elliott
Gerard de Melo
VLM
412
1
0
26 May 2025
Data Doping or True Intelligence? Evaluating the Transferability of Injected Knowledge in LLMs
Essa Jan
Moiz Ali
Muhammad Saram Hassan
Fareed Zaffar
Yasir Zaki
KELM
156
1
0
22 May 2025
ProMind-LLM: Proactive Mental Health Care via Causal Reasoning with Sensor Data
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Xinzhe Zheng
Sijie Ji
Jiawei Sun
Ruoxin Chen
Wei Gao
Mani Srivastava
AI4MH
LRM
198
5
0
20 May 2025
Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders
Agam Goyal
Vedant Rathi
William Yeh
Yian Wang
Yuen Chen
Hari Sundaram
363
1
0
20 May 2025
Shadow-FT: Tuning Instruct Model via Training on Paired Base Model
Taiqiang Wu
Runming Yang
Jiayi Li
Pengfei Hu
Ngai Wong
Yujiu Yang
684
1
0
19 May 2025
Krikri: Advancing Open Large Language Models for Greek
Dimitris Roussis
Leon Voukoutis
Georgios Paraskevopoulos
Sokratis Sofianopoulos
Prokopis Prokopidis
Vassilis Papavasileiou
Athanasios Katsamanis
Stelios Piperidis
Vassilis Katsouros
ALM
409
6
0
19 May 2025
Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Shaobo Wang
Xiangqi Jin
Ziming Wang
Jinqiao Wang
Jingyun Zhang
...
Zichen Wen
Zhong Li
Bin Wang
Xuming Hu
Linfeng Zhang
SyDa
426
14
0
18 May 2025
Telco-oRAG: Optimizing Retrieval-augmented Generation for Telecom Queries via Hybrid Retrieval and Neural Routing
IEEE Journal on Selected Areas in Communications (JSAC), 2025
Andrei-Laurentiu Bornea
Fadhel Ayed
Antonio De Domenico
Nicola Piovesan
Tareq Si Salem
Ali Maatouk
246
2
0
17 May 2025
A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jean-Philippe Corbeil
Amin Dada
Jean-Michel Attendu
Asma Ben Abacha
Alessandro Sordoni
Lucas Caccia
François Beaulieu
Thomas Lin
Jens Kleesiek
Paul Vozila
LM&MA
362
11
0
15 May 2025
Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning
Shaurya Sharthak
Vinayak Pahalwan
Adithya Kamath
Adarsh Shirawalmath
CLL
VLM
395
1
0
14 May 2025
Training Strategies for Efficient Embodied Reasoning
William Chen
Suneel Belkhale
Suvir Mirchandani
Oier Mees
Danny Driess
Karl Pertsch
Sergey Levine
OffRL
LRM
425
26
0
13 May 2025
Large Language Models Meet Stance Detection: A Survey of Tasks, Methods, Applications, Challenges and Future Directions
Lata Pangtey
Anukriti Bhatnagar
Shubhi Bansal
Shahid Shafi Dar
Nagendra Kumar
301
3
0
13 May 2025
Prediction-powered estimators for finite population statistics in highly imbalanced textual data: Public hate crime estimation
Hannes Waldetoft
Jakob Torgander
Måns Magnusson
230
2
0
05 May 2025
Investigating Task Arithmetic for Zero-Shot Information Retrieval
Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), 2025
Marco Braga
Pranav Kasela
Alessandro Raganato
G. Pasi
RALM
363
1
0
01 May 2025
Small or Large? Zero-Shot or Finetuned? Guiding Language Model Choice for Specialized Applications in Healthcare
Machine Learning and Knowledge Extraction (MLKE), 2025
Lovedeep Gondara
Jonathan Simkin
Graham Sayle
Shebnum Devji
Gregory Arbour
Raymond Ng
LM&MA
163
3
0
29 Apr 2025
Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
Paul Kassianik
Baturay Saglam
Alexander Chen
Blaine Nelson
Anu Vellore
...
Hyrum Anderson
Kojin Oshiba
Omar Santos
Yaron Singer
Amin Karbasi
PILM
290
18
0
28 Apr 2025
Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language
International Conference on Text, Speech and Dialogue (TSD), 2025
Anastasia Zhukova
Christian E. Matt
Terry Ruas
CLL
VLM
428
2
0
28 Apr 2025
Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
Wataru Kawakami
Keita Suzuki
Junichiro Iwasawa
LRM
332
3
0
25 Apr 2025
TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation
Gwen Yidou Weng
Benjie Wang
Karen Ullrich
BDL
883
4
0
25 Apr 2025
Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation
North American Chapter of the Association for Computational Linguistics (NAACL), 2025
Luca Moroni
Giovanni Puccetti
Pere-Lluís Huguet Cabot
Andrei Stefan Bejgu
Edoardo Barba
Alessio Miaschi
F. Dell’Orletta
Andrea Esuli
Roberto Navigli
284
5
0
23 Apr 2025
T-VEC: A Telecom-Specific Vectorization Model with Enhanced Semantic Understanding via Deep Triplet Loss Fine-Tuning
Vignesh Ethiraj
Ashwath David
Sidhanth Menon
Divya Vijay
Vidhyakshaya Kannan
237
2
0
23 Apr 2025
Knowledge Distillation and Dataset Distillation of Large Language Models: Emerging Trends, Challenges, and Future Directions
Luyang Fang
Xiaowei Yu
Jianfeng Cai
Yongkai Chen
Shushan Wu
...
Wenxuan Zhong
Tianming Liu
Ping Ma
ALM
272
13
0
20 Apr 2025
Probing the Subtle Ideological Manipulation of Large Language Models
Demetris Paschalides
G. Pallis
M. Dikaiakos
186
0
0
19 Apr 2025
Continual Pre-Training is (not) What You Need in Domain Adaption
Pin-Er Chen
Da-Chen Lian
S. Hsieh
Sieh-Chuen Huang
Hsuan-Lei Shao
...
Yang-Hsien Lin
Zih-Ching Chen
Cheng-Kuang
Eddie TC Huang
Simon See
CLL
AILaw
314
1
0
18 Apr 2025
Memorization vs. Reasoning: Updating LLMs with New Knowledge
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Aochong Oliver Li
Tanya Goyal
KELM
345
8
0
16 Apr 2025
CSPLADE: Learned Sparse Retrieval with Causal Language Models
Zhichao Xu
Aosong Feng
Yijun Tian
Haibo Ding
Lin Leee Cheong
RALM
443
7
0
15 Apr 2025
Enhancing Dialogue Systems with Discourse-Level Understanding Using Deep Canonical Correlation Analysis
Akanksha Mehndiratta
Krishna Asawa
63
0
0
12 Apr 2025
Exploring Gradient-Guided Masked Language Model to Detect Textual Adversarial Attacks
Xiaomei Zhang
Zhaoxi Zhang
Yanjun Zhang
Xufei Zheng
L. Zhang
Shengshan Hu
Shirui Pan
AAML
238
2
0
08 Apr 2025
Mapping biodiversity at very-high resolution in Europe
César Leblanc
Lukás Picek
Benjamin Deneu
P. Bonnet
Maximilien Servajean
Rémi Palard
Alexis Joly
165
4
0
07 Apr 2025
GraphSeg: Segmented 3D Representations via Graph Edge Addition and Contraction
Haozhan Tang
Tianyi Zhang
Oliver Kroemer
Matthew Johnson-Roberson
Weiming Zhi
3DPC
250
1
0
04 Apr 2025
On the Connection Between Diffusion Models and Molecular Dynamics
Liam Harcombe
Timothy T. Duignan
DiffM
329
1
0
04 Apr 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Annual Meeting of the Association for Computational Linguistics (ACL), 2025
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
418
4
0
02 Apr 2025
Context-Aware Toxicity Detection in Multiplayer Games: Integrating Domain-Adaptive Pretraining and Match Metadata
Adrien Schurger-Foy
Rafal Kocielnik
Caglar Gulcehre
R. Alvarez
189
0
0
02 Apr 2025
Beyond Vanilla Fine-Tuning: Leveraging Multistage, Multilingual, and Domain-Specific Methods for Low-Resource Machine Translation
Sarubi Thillainathan
Songchen Yuan
E. Lee
Sanath Jayasena
Surangika Ranathunga
277
1
0
28 Mar 2025
Penrose Tiled Low-Rank Compression and Section-Wise Q&A Fine-Tuning: A General Framework for Domain-Specific Large Language Model Adaptation
Chuan-Wei Kuo
Siyu Chen
Chenqi Yan
Yu Liu
201
0
0
28 Mar 2025