2004.10964
Cited By
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 1,369 papers shown
Harnessing Diversity for Important Data Selection in Pretraining Large Language Models
International Conference on Learning Representations (ICLR), 2024
Chi Zhang
Huaping Zhong
Kuan Zhang
Chengliang Chai
Rui Wang
...
Lei Cao
Ju Fan
Ye Yuan
Guoren Wang
Conghui He
TDI
252
28
0
25 Sep 2024
Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions
Zeyneb N. Kaya
Souvick Ghosh
129
0
0
25 Sep 2024
OSINT Clinic: Co-designing AI-Augmented Collaborative OSINT Investigations for Vulnerability Assessment
International Conference on Human Factors in Computing Systems (CHI), 2024
Anirban Mukhopadhyay
Kurt Luther
238
3
0
18 Sep 2024
MindGuard: Towards Accessible and Sitgma-free Mental Health First Aid via Edge LLM
Sijie Ji
Xinzhe Zheng
Jiawei Sun
Renqi Chen
Wei Gao
Mani Srivastava
AI4MH
244
10
0
16 Sep 2024
Gaps or Hallucinations? Gazing into Machine-Generated Legal Analysis for Fine-grained Text Evaluations
Abe Bohan Hou
William Jurayj
Nils Holzenberger
Andrew Blair-Stanek
Benjamin Van Durme
ELM
252
2
0
16 Sep 2024
Towards understanding evolution of science through language model series
Junjie Dong
Zhuoqi Lyu
Qing Ke
AI4TS
409
1
0
15 Sep 2024
DomURLs_BERT: Pre-trained BERT-based Model for Malicious Domains and URLs Detection and Classification
Abdelkader El Mahdaouy
Salima Lamsiyah
Meryem Janati Idrissi
H. Alami
Zakaria Yartaoui
Ismail Berrada
142
9
0
13 Sep 2024
Self-Masking Networks for Unsupervised Adaptation
German Conference on Pattern Recognition (DAGM), 2024
Alfonso Taboada Warmerdam
Mathilde Caron
Yuki M. Asano
305
2
0
11 Sep 2024
Synthetic continued pretraining
International Conference on Learning Representations (ICLR), 2024
Zitong Yang
Neil Band
Shuangping Li
Emmanuel Candès
Tatsunori Hashimoto
CLL
SyDa
355
35
0
11 Sep 2024
A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD), 2024
Ningyuan Xi
Yetao Wu
Kun Fan
Teng Chen
Qingqing Gu
Peng Yu
ALM
191
0
0
10 Sep 2024
A Comparative Study of Pre-training and Self-training
Yiheng Wang
Jiayu Lin
Zuoquan Lin
SSL
323
1
0
04 Sep 2024
LUK: Empowering Log Understanding with Expert Knowledge from Large Language Models
IEEE Transactions on Software Engineering (TSE), 2024
Lipeng Ma
Weidong Yang
Sihang Jiang
Ben Fei
Mingjie Zhou
Shuhao Li
Bo Xu
Yanghua Xiao
397
4
0
03 Sep 2024
Pre-Trained Language Models for Keyphrase Prediction: A Review
ICT express (IE), 2024
Muhammad Umair
Tangina Sultana
Young-Koo Lee
313
8
0
02 Sep 2024
From Prediction to Application: Language Model-based Code Knowledge Tracing with Domain Adaptive Pre-Training and Automatic Feedback System with Pedagogical Prompting for Comprehensive Programming Education
Unggi Lee
Jiyeong Bae
Yeonji Jung
Minji Kang
Gyuri Byun
...
Sookbun Lee
Jaekwon Park
Taekyung Ahn
Gunho Lee
Hyeoncheol Kim
AI4Ed
KELM
249
2
0
31 Aug 2024
Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts
Nikolas Gritsch
Qizhen Zhang
Acyr Locatelli
Sara Hooker
Ahmet Üstün
MoE
213
7
0
28 Aug 2024
Language Adaptation on a Tight Academic Compute Budget: Tokenizer Swapping Works and Pure bfloat16 Is Enough
Konstantin Dobler
Gerard de Melo
204
4
0
28 Aug 2024
Prior-free Balanced Replay: Uncertainty-guided Reservoir Sampling for Long-Tailed Continual Learning
ACM Multimedia (MM), 2024
Lei Liu
Li Liu
Yawen Cui
CLL
222
1
0
27 Aug 2024
CIPHER: Cybersecurity Intelligent Penetration-testing Helper for Ethical Researcher
Italian National Conference on Sensors (INS), 2024
Derry Pratama
Naufal Suryanto
Andro Aprila Adiputra
Thi-Thu-Huong Le
Ahmada Yusril Kadiptya
Muhammad Iqbal
Howon Kim
210
18
0
21 Aug 2024
Predicting Rewards Alongside Tokens: Non-disruptive Parameter Insertion for Efficient Inference Intervention in Large Language Model
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Chenhan Yuan
Fei Huang
Ru Peng
Keming Lu
Bowen Yu
Chang Zhou
Jingren Zhou
KELM
217
0
0
20 Aug 2024
Summarizing long regulatory documents with a multi-step pipeline
Mika Sie
Ruby Beek
Michiel Bots
S. Brinkkemper
Albert Gatt
AILaw
ELM
180
5
0
19 Aug 2024
NoRA: Nested Low-Rank Adaptation for Efficient Fine-Tuning Large Models
Cheng Lin
Lujun Li
Dezhi Li
Jie Zou
Wei Xue
Yike Guo
AI4TS
265
14
0
18 Aug 2024
Diffusion Guided Language Modeling
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Justin Lovelace
Varsha Kishore
Yiwei Chen
Kilian Q. Weinberger
294
10
0
08 Aug 2024
Automated Review Generation Method Based on Large Language Models
National Science Review (NSR), 2024
Shican Wu
Xiao Ma
Dehui Luo
Lulu Li
Xiangcheng Shi
...
Ran Luo
Chunlei Pei
Zhi-Jian Zhao
Jinlong Gong
558
9
0
30 Jul 2024
Do LLMs Really Adapt to Domains? An Ontology Learning Perspective
International Workshop on the Semantic Web (SW), 2024
Huu Tan Mai
Cuong Xuan Chu
Heiko Paulheim
173
23
0
29 Jul 2024
Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery
Yuni Susanti
Michael Färber
224
9
0
26 Jul 2024
ChipExpert: The Open-Source Integrated-Circuit-Design-Specific Large Language Model
Ning Xu
Zhaoyang Zhang
Lei Qi
Wensuo Wang
Chao Zhang
...
Mengyao Zhao
Junbo Liu
Yufan Song
Xin Geng
Jun Yang
125
3
0
26 Jul 2024
CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models
Jiawei Gu
Zacc Yang
Chuanghao Ding
Rui Zhao
Fei Tan
CLL
309
17
0
24 Jul 2024
A Novel Two-Step Fine-Tuning Pipeline for Cold-Start Active Learning in Text Classification Tasks
F. Belém
Washington Cunha
Celso França
Claudio Andrade
Leonardo Rocha
M. A. Gonçalves
138
3
0
24 Jul 2024
Towards Aligning Language Models with Textual Feedback
Sauc Abadal Lloret
Shehzaad Dhuliawala
K. Murugesan
Mrinmaya Sachan
VLM
374
1
0
24 Jul 2024
Structure-aware Domain Knowledge Injection for Large Language Models
Kai-Chun Liu
Ze Chen
Zhihang Fu
Rongxin Jiang
Fan Zhou
Yao-Shen Chen
Yue-bo Wu
Yue Wu
Jieping Ye
178
0
0
23 Jul 2024
Domain-Specific Pretraining of Language Models: A Comparative Study in the Medical Field
Tobias Kerner
ELM
LM&MA
306
6
0
19 Jul 2024
ChipXplore: Natural Language Exploration of Hardware Designs and Libraries
Manar Abdelatty
Sherief Reda
257
0
0
17 Jul 2024
On Large Language Model Continual Unlearning
Chongyang Gao
Lixu Wang
Chenkai Weng
Tianlin Li
Qi Zhu
MU
271
0
0
14 Jul 2024
The Sociolinguistic Foundations of Language Modeling
Jack Grieve
Sara Bartl
Matteo Fuoli
Jason Grafmiller
Weihang Huang
A. Jawerbaum
Akira Murakami
Marcus Perlman
Dana Roemling
Bodo Winter
309
26
0
12 Jul 2024
Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
K. Kenthapadi
M. Sameki
Ankur Taly
HILM
ELM
AILaw
220
33
0
10 Jul 2024
Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models
Jupinder Parmar
Sanjev Satheesh
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
286
53
0
09 Jul 2024
CodeUpdateArena: Benchmarking Knowledge Editing on API Updates
Zeyu Leo Liu
Shrey Pandit
Xi Ye
Eunsol Choi
Greg Durrett
KELM
ALM
405
13
0
08 Jul 2024
BadCLM: Backdoor Attack in Clinical Language Models for Electronic Health Records
Weimin Lyu
Zexin Bi
Fusheng Wang
Chao Chen
253
10
0
06 Jul 2024
Using LLMs to label medical papers according to the CIViC evidence model
Markus Hisch
Xing David Wang
196
0
0
05 Jul 2024
Multi-Task Domain Adaptation for Language Grounding with 3D Objects
Yixiang Chen
Yaoxian Song
Xinglin Pan
Peijie Dong
Xiaofei Yang
Qiang-qiang Wang
Zhixu Li
Tiefeng Li
Xiaowen Chu
286
2
0
03 Jul 2024
Sociocultural Considerations in Monitoring Anti-LGBTQ+ Content on Social Media
Sidney G. -J. Wong
149
0
0
01 Jul 2024
M2QA: Multi-domain Multilingual Question Answering
Leon Arne Engländer
Hannah Sterz
Clifton A. Poth
Jonas Pfeiffer
Ilia Kuznetsov
Iryna Gurevych
VLM
254
5
0
01 Jul 2024
Locate&Edit: Energy-based Text Editing for Efficient, Flexible, and Faithful Controlled Text Generation
Hye Ryung Son
Jay-Yoon Lee
168
3
0
30 Jun 2024
KPC-cF: Aspect-Based Sentiment Analysis via Implicit-Feature Alignment with Corpus Filtering
Kibeom Nam
408
0
0
29 Jun 2024
SMLT-MUGC: Small, Medium, and Large Texts -- Machine versus User-Generated Content Detection and Comparison
Anjali Rawal
Hui Wang
Youjia Zheng
Yu-Hsuan Lin
Shanu Sushmita
DeLMO
189
0
0
28 Jun 2024
ProgressGym: Alignment with a Millennium of Moral Progress
Tianyi Qiu
Yang Zhang
Xuchuan Huang
Jasmine Xinze Li
Yalan Qin
Yaodong Yang
AI4TS
278
9
0
28 Jun 2024
CHEW: A Dataset of CHanging Events in Wikipedia
Hsuvas Borkakoty
Luis Espinosa-Anke
234
2
0
27 Jun 2024
MPCODER: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning
Zhenlong Dai
Chang Yao
Wenkang Han
Ying Yuan
Zhipeng Gao
Jingyuan Chen
185
22
0
25 Jun 2024
Task Oriented In-Domain Data Augmentation
Xiao Liang
Xinyu Hu
Simiao Zuo
Yeyun Gong
Qiang Lou
Yi Liu
Shao-Lun Huang
Jian Jiao
194
8
0
24 Jun 2024
Evaluating the Effectiveness of the Foundational Models for Q&A Classification in Mental Health care
Hassan Alhuzali
Ashwag Alasmari
AI4MH
262
4
0
23 Jun 2024