Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2004.10964
Cited By
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 332 papers shown
Title
Achieving Tokenizer Flexibility in Language Models through Heuristic Adaptation and Supertoken Learning
Shaurya Sharthak
Vinayak Pahalwan
Adithya Kamath
Adarsh Shirawalmath
CLL
VLM
41
0
0
14 May 2025
Prediction-powered estimators for finite population statistics in highly imbalanced textual data: Public hate crime estimation
Hannes Waldetoft
Jakob Torgander
Måns Magnusson
29
0
0
05 May 2025
Efficient Domain-adaptive Continual Pretraining for the Process Industry in the German Language
Anastasia Zhukova
Christian E. Matt
Terry Ruas
Bela Gipp
CLL
VLM
98
0
0
28 Apr 2025
Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report
Paul Kassianik
Baturay Saglam
Alexander Chen
Blaine Nelson
Anu Vellore
...
Hyrum Anderson
Kojin Oshiba
Omar Santos
Yaron Singer
Amin Karbasi
PILM
61
0
0
28 Apr 2025
TRACE Back from the Future: A Probabilistic Reasoning Approach to Controllable Language Generation
Gwen Yidou Weng
Benjie Wang
Guy Van den Broeck
BDL
102
0
0
25 Apr 2025
Stabilizing Reasoning in Medical LLMs with Continued Pretraining and Reasoning Preference Optimization
Wataru Kawakami
Keita Suzuki
Junichiro Iwasawa
LRM
68
0
0
25 Apr 2025
CSPLADE: Learned Sparse Retrieval with Causal Language Models
Zhichao Xu
Aosong Feng
Yijun Tian
Haibo Ding
Lin Leee Cheong
RALM
40
0
0
15 Apr 2025
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
Jeffrey Li
Mohammadreza Armandpour
Iman Mirzadeh
Sachin Mehta
Vaishaal Shankar
...
Samy Bengio
Oncel Tuzel
Mehrdad Farajtabar
Hadi Pouransari
Fartash Faghri
CLL
KELM
59
0
0
02 Apr 2025
OmniScience: A Domain-Specialized LLM for Scientific Reasoning and Discovery
Vignesh Prabhakar
Md Amirul Islam
Adam Atanas
Y. Wang
J. N. Han
...
Rucha Apte
Robert Clark
Kang Xu
Zihan Wang
Kai Liu
LRM
80
1
0
22 Mar 2025
A Dataset for Analysing News Framing in Chinese Media
Owen Cook
Yida Mu
Xinye Yang
Xingyi Song
Kalina Bontcheva
65
1
0
06 Mar 2025
Personalize Your LLM: Fake it then Align it
Yijing Zhang
Dyah Adila
Changho Shin
Frederic Sala
86
0
0
02 Mar 2025
NaijaNLP: A Survey of Nigerian Low-Resource Languages
Isa Inuwa-Dutse
42
0
0
27 Feb 2025
A Survey of Model Architectures in Information Retrieval
Zhichao Xu
Fengran Mo
Zhiqi Huang
Crystina Zhang
Puxuan Yu
Bei Wang
Jimmy J. Lin
Vivek Srikumar
KELM
3DV
48
2
0
21 Feb 2025
Evaluating Implicit Bias in Large Language Models by Attacking From a Psychometric Perspective
Yuchen Wen
Keping Bi
Wei Chen
J. Guo
Xueqi Cheng
81
1
0
20 Feb 2025
Why does my medical AI look at pictures of birds? Exploring the efficacy of transfer learning across domain boundaries
F. Jonske
M. Kim
Enrico Nasca
J. Evers
Johannes Haubold
...
F. Nensa
Michael Kamp
C. Seibold
Jan Egger
Jens Kleesiek
79
1
0
17 Feb 2025
FinMTEB: Finance Massive Text Embedding Benchmark
Yixuan Tang
Yi Yang
AIFin
63
0
0
16 Feb 2025
RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset
Naome A. Etori
Maria L. Gini
76
2
0
10 Feb 2025
Detecting harassment and defamation in cyberbullying with emotion-adaptive training
Peiling Yi
A. Zubiaga
Yunfei Long
78
0
0
28 Jan 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Erik Cambria
LM&MA
AILaw
93
153
0
28 Jan 2025
Speech Translation Refinement using Large Language Models
Huaixia Dou
Xinyu Tian
Xinglin Lyu
Jie Zhu
Junhui Li
Lifan Guo
116
0
0
28 Jan 2025
Addressing Bias in Generative AI: Challenges and Research Opportunities in Information Management
Xiahua Wei
Naveen Kumar
Han Zhang
61
3
0
22 Jan 2025
News Without Borders: Domain Adaptation of Multilingual Sentence Embeddings for Cross-lingual News Recommendation
Andreea Iana
Fabian David Schmidt
Goran Glavas
Heiko Paulheim
63
3
0
20 Jan 2025
The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better
Scott Geng
Cheng-Yu Hsieh
Vivek Ramanujan
Matthew Wallingford
Chun-Liang Li
Pang Wei Koh
Ranjay Krishna
DiffM
60
6
0
03 Jan 2025
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao
Shiqian Su
X. Zhu
Chenyu Zhang
Zhe Chen
...
Wenhai Wang
Lewei Lu
Gao Huang
Yu Qiao
Jifeng Dai
MLLM
VLM
104
2
0
20 Dec 2024
A recent evaluation on the performance of LLMs on radiation oncology physics using questions of randomly shuffled options
Peilong Wang
J. Holmes
Z. Liu
Dequan Chen
Tianming Liu
Jiajian Shen
W. Liu
LRM
ELM
LM&MA
76
0
0
14 Dec 2024
CPRM: A LLM-based Continual Pre-training Framework for Relevance Modeling in Commercial Search
Kaixin Wu
Yixin Ji
Z. Chen
Qiang Wang
Cunxiang Wang
...
Jia Xu
Zhongyi Liu
Jinjie Gu
Yuan Zhou
Linjian Mo
KELM
CLL
92
0
0
02 Dec 2024
Efficient Alignment of Large Language Models via Data Sampling
Amrit Khera
Rajat Ghosh
Debojyoti Dutta
33
1
0
15 Nov 2024
Gradient Localization Improves Lifelong Pretraining of Language Models
Jared Fernandez
Yonatan Bisk
Emma Strubell
KELM
31
1
0
07 Nov 2024
DELIFT: Data Efficient Language model Instruction Fine Tuning
Ishika Agarwal
Krishnateja Killamsetty
Lucian Popa
Marina Danilevksy
ALM
VLM
46
2
0
07 Nov 2024
Latent Paraphrasing: Perturbation on Layers Improves Knowledge Injection in Language Models
Minki Kang
Sung Ju Hwang
Gibbeum Lee
Jaewoong Cho
KELM
32
0
0
01 Nov 2024
ZIP-FIT: Embedding-Free Data Selection via Compression-Based Alignment
Elyas Obbad
Iddah Mlauzi
Brando Miranda
Rylan Schaeffer
Kamal Obbad
Suhana Bedi
Sanmi Koyejo
CVBM
48
0
0
23 Oct 2024
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation
Xianyang Zhan
Agam Goyal
Yilun Chen
Eshwar Chandrasekharan
Koustuv Saha
AI4MH
103
0
0
17 Oct 2024
REFINE on Scarce Data: Retrieval Enhancement through Fine-Tuning via Model Fusion of Embedding Models
Ambuje Gupta
Mrinal Rawat
Andreas Stolcke
Roberto Pieraccini
RALM
19
1
0
16 Oct 2024
ELICIT: LLM Augmentation via External In-Context Capability
Futing Wang
Jianhao Yan
Yue Zhang
Tao Lin
39
0
0
12 Oct 2024
Extracting and Transferring Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Zhipeng Chen
Liang Song
K. Zhou
Wayne Xin Zhao
B. Wang
Weipeng Chen
Ji-Rong Wen
63
0
0
10 Oct 2024
From Tokens to Words: On the Inner Lexicon of LLMs
Guy Kaplan
Matanel Oren
Yuval Reif
Roy Schwartz
41
12
0
08 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
112
0
0
07 Oct 2024
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
Tianjian Li
Haoran Xu
Weiting Tan
Kenton Murray
Daniel Khashabi
35
1
0
06 Oct 2024
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
David Grangier
Simin Fan
Skyler Seto
Pierre Ablin
36
3
0
30 Sep 2024
Do We Need Domain-Specific Embedding Models? An Empirical Investigation
Yixuan Tang
Yi Yang
AIFin
38
3
0
27 Sep 2024
MindGuard: Towards Accessible and Sitgma-free Mental Health First Aid via Edge LLM
Sijie Ji
Xinzhe Zheng
Jiawei Sun
Renqi Chen
Wei Gao
Mani Srivastava
AI4MH
32
3
0
16 Sep 2024
Self-Masking Networks for Unsupervised Adaptation
Alfonso Taboada Warmerdam
Mathilde Caron
Yuki M. Asano
39
1
0
11 Sep 2024
LUK: Empowering Log Understanding with Expert Knowledge from Large Language Models
Lipeng Ma
Weidong Yang
Sihang Jiang
Ben Fei
Mingjie Zhou
Shuhao Li
Bo Xu
Bo Xu
Yanghua Xiao
51
0
0
03 Sep 2024
From Prediction to Application: Language Model-based Code Knowledge Tracing with Domain Adaptive Pre-Training and Automatic Feedback System with Pedagogical Prompting for Comprehensive Programming Education
Unggi Lee
Jiyeong Bae
Yeonji Jung
Minji Kang
Gyuri Byun
...
Sookbun Lee
Jaekwon Park
Taekyung Ahn
Gunho Lee
Hyeoncheol Kim
AI4Ed
KELM
26
1
0
31 Aug 2024
Diffusion Guided Language Modeling
Justin Lovelace
Varsha Kishore
Yiwei Chen
Kilian Q. Weinberger
36
6
0
08 Aug 2024
Automated Review Generation Method Based on Large Language Models
Shican Wu
Xiao Ma
Dehui Luo
Lulu Li
Xiangcheng Shi
...
Ran Luo
Chunlei Pei
Zhijian Zhao
Zhi-Jian Zhao
Jinlong Gong
69
0
0
30 Jul 2024
Towards Aligning Language Models with Textual Feedback
Sauc Abadal Lloret
S. Dhuliawala
K. Murugesan
Mrinmaya Sachan
VLM
38
1
0
24 Jul 2024
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates
Zeyu Leo Liu
Shrey Pandit
Xi Ye
Eunsol Choi
Greg Durrett
KELM
ALM
66
4
0
08 Jul 2024
BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records
Weimin Lyu
Zexin Bi
Fusheng Wang
Chao Chen
42
5
0
06 Jul 2024
Using LLMs to label medical papers according to the CIViC evidence model
Markus Hisch
Xing David Wang
31
0
0
05 Jul 2024
1
2
3
4
5
6
7
Next