2004.10964
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Papers citing "Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 1,369 papers shown
SPICA: Retrieving Scenarios for Pluralistic In-Context Alignment
Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Quan Ze Chen
K. J. Kevin Feng
Chan Young Park
Amy X. Zhang
230
2
0
16 Nov 2024
Efficient Alignment of Large Language Models via Data Sampling
Amrit Khera
Rajat Ghosh
Debojyoti Dutta
496
1
0
15 Nov 2024
Unstructured Text Enhanced Open-domain Dialogue System: A Systematic Survey
Longxuan Ma
Mingda Li
Weinan Zhang
Jiapeng Li
Ting Liu
354
19
0
14 Nov 2024
Over-parameterized Student Model via Tensor Decomposition Boosted Knowledge Distillation
Neural Information Processing Systems (NeurIPS), 2024
Yu-Liang Zhan
Zhong-Yi Lu
Hao Sun
Ze-Feng Gao
250
2
0
10 Nov 2024
CoPrompter: User-Centric Evaluation of LLM Instruction Alignment for Improved Prompt Engineering
International Conference on Intelligent User Interfaces (IUI), 2024
Ishika Joshi
Simra Shahid
Shreeya Venneti
Manushree Vasu
Yantao Zheng
Yunyao Li
Balaji Krishnamurthy
Gromit Yeuk-Yin Chan
304
19
0
09 Nov 2024
Gradient Localization Improves Lifelong Pretraining of Language Models
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Jared Fernandez
Yonatan Bisk
Emma Strubell
KELM
308
4
0
07 Nov 2024
DELIFT: Data Efficient Language model Instruction Fine Tuning
International Conference on Learning Representations (ICLR), 2024
Ishika Agarwal
Krishnateja Killamsetty
Yatin Nandwani
Marina Danilevsky
ALM
VLM
714
8
0
07 Nov 2024
Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Daniel P. Jeong
Saurabh Garg
Zachary Chase Lipton
Michael Oberst
LM&MA
VLM
ELM
242
31
0
06 Nov 2024
A Bayesian Approach to Data Point Selection
Neural Information Processing Systems (NeurIPS), 2024
Xinnuo Xu
Minyoung Kim
Royson Lee
Brais Martínez
Timothy M. Hospedales
243
2
0
06 Nov 2024
Latent Paraphrasing: Perturbation on Layers Improves Knowledge Injection in Language Models
Neural Information Processing Systems (NeurIPS), 2024
Minki Kang
Sung Ju Hwang
Gibbeum Lee
Jaewoong Cho
KELM
295
0
0
01 Nov 2024
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays
Neural Information Processing Systems (NeurIPS), 2024
Yang Zhou
Tan Li Hui Faith
Yanyu Xu
Sicong Leng
Xinxing Xu
Yong Liu
Rick Siow Mong Goh
SSL
VLM
LM&MA
MedIm
175
3
0
29 Oct 2024
RoBIn: A Transformer-Based Model For Risk Of Bias Inference With Machine Reading Comprehension
Journal of Biomedical Informatics (JBI), 2024
Abel Corrêa Dias
Viviane Pereira Moreira
João Luiz Dihl Comba
253
2
0
28 Oct 2024
TransformLLM: Adapting Large Language Models via LLM-Transformed Reading Comprehension Text
Iftach Arbel
Yehonathan Refael
Ofir Lindenbaum
AILaw
152
2
0
28 Oct 2024
Reducing the Scope of Language Models
David Yunis
Siyu Huo
Chulaka Gunasekara
Danish Contractor
KELM
274
0
0
28 Oct 2024
RSA-Control: A Pragmatics-Grounded Lightweight Controllable Text Generation Framework
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yifan Wang
Vera Demberg
219
9
0
24 Oct 2024
Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Sergio Burdisso
S. Madikeri
P. Motlícek
342
6
0
24 Oct 2024
ZIP-FIT: Embedding-Free Data Selection via Compression-Based Alignment
Elyas Obbad
Iddah Mlauzi
Alycia Lee
Rylan Schaeffer
Kamal Obbad
Suhana Bedi
Sanmi Koyejo
CVBM
302
0
0
23 Oct 2024
DomainSum: A Hierarchical Benchmark for Fine-Grained Domain Shift in Abstractive Text Summarization
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Haohan Yuan
Haopeng Zhang
212
3
0
21 Oct 2024
Scalable Data Ablation Approximations for Language Models through Modular Training and Merging
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Clara Na
Ian H. Magnusson
A. Jha
Tom Sherborne
Emma Strubell
Jesse Dodge
Pradeep Dasigi
MoMe
165
8
0
21 Oct 2024
MELT: Materials-aware Continued Pre-training for Language Model Adaptation to Materials Science
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Junho Kim
Yeachan Kim
Jun-Hyung Park
Yerim Oh
Suho Kim
S. Lee
CLL
AI4CE
172
5
0
19 Oct 2024
From Babbling to Fluency: Evaluating the Evolution of Language Models in Terms of Human Language Acquisition
Qiyuan Yang
Pengda Wang
Luke D. Plonsky
Frederick L. Oswald
Hanjie Chen
ELM
233
3
0
17 Oct 2024
BanTH: A Multi-label Hate Speech Detection Dataset for Transliterated Bangla
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Fabiha Haider
Fariha Tanjim Shifat
Md Farhan Ishmam
Deeparghya Dutta Barua
Md Sakib Ul Rahman Sourove
Md Fahim
Md Farhad Alam
342
9
0
17 Oct 2024
SLM-Mod: Small Language Models Surpass LLMs at Content Moderation
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Xianyang Zhan
Agam Goyal
Yilun Chen
Eshwar Chandrasekharan
Koustuv Saha
AI4MH
909
18
0
17 Oct 2024
Tracking Universal Features Through Fine-Tuning and Model Merging
Niels Horn
Desmond Elliott
MoMe
151
0
0
16 Oct 2024
Prompt Compression for Large Language Models: A Survey
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Zongqian Li
Yinhong Liu
Yixuan Su
Nigel Collier
MQ
310
42
0
16 Oct 2024
REFINE on Scarce Data: Retrieval Enhancement through Fine-Tuning via Model Fusion of Embedding Models
Applied Informatics (AI), 2024
Ambuje Gupta
Mrinal Rawat
Andreas Stolcke
Roberto Pieraccini
RALM
197
1
0
16 Oct 2024
Exploring Large Language Models for Hate Speech Detection in Rioplatense Spanish
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Juan Manuel Pérez
Paula Miguel
Viviana Cotik
116
5
0
16 Oct 2024
TSDS: Data Selection for Task-Specific Model Finetuning
Neural Information Processing Systems (NeurIPS), 2024
Zifan Liu
Amin Karbasi
Theodoros Rekatsinas
309
15
0
15 Oct 2024
LexSumm and LexT5: Benchmarking and Modeling Legal Summarization Tasks in English
T. Y. S. S. Santosh
Cornelius Weiss
Matthias Grabmair
AILaw
ELM
466
9
0
12 Oct 2024
ELICIT: LLM Augmentation via External In-Context Capability
International Conference on Learning Representations (ICLR), 2024
Futing Wang
Jianhao Yan
Yue Zhang
Tao Lin
382
6
0
12 Oct 2024
Extracting and Combining Abilities For Building Multi-lingual Ability-enhanced Large Language Models
Zhipeng Chen
Liang Song
K. Zhou
Wayne Xin Zhao
Binghai Wang
Weipeng Chen
Ji-Rong Wen
410
0
0
10 Oct 2024
Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
International Conference on Learning Representations (ICLR), 2024
Zeman Li
Xinwei Zhang
Peilin Zhong
Yuan Deng
Meisam Razaviyayn
Vahab Mirrokni
286
11
0
09 Oct 2024
Adaptation Odyssey in LLMs: Why Does Additional Pretraining Sometimes Fail to Improve?
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Fırat Öncel
Matthias Bethge
Beyza Ermis
Mirco Ravanelli
Cem Subakan
Çağatay Yıldız
230
5
0
08 Oct 2024
From Tokens to Words: On the Inner Lexicon of LLMs
International Conference on Learning Representations (ICLR), 2024
Guy Kaplan
Matanel Oren
Yuval Reif
Roy Schwartz
443
30
0
08 Oct 2024
DEPT: Decoupled Embeddings for Pre-training Language Models
International Conference on Learning Representations (ICLR), 2024
Alex Iacob
Lorenzo Sani
Meghdad Kurmanji
William F. Shen
Xinchi Qiu
Dongqi Cai
Yan Gao
Nicholas D. Lane
VLM
1.4K
2
0
07 Oct 2024
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
North American Chapter of the Association for Computational Linguistics (NAACL), 2024
Tianjian Li
Haoran Xu
Weiting Tan
Kenton Murray
Daniel Khashabi
525
3
0
06 Oct 2024
Entity Insertion in Multilingual Linked Corpora: The Case of Wikipedia
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Tomás Feith
Akhil Arora
Martin Gerlach
Debjit Paul
Robert West
KELM
239
7
0
05 Oct 2024
Still Not Quite There! Evaluating Large Language Models for Comorbid Mental Health Diagnosis
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Amey Hengle
Atharva Kulkarni
Shantanu Patankar
Madhumitha Chandrasekaran
Sneha D'Silva
Jemima Jacob
Rashmi Gupta
AI4MH
201
9
0
04 Oct 2024
Large Language Models can be Strong Self-Detoxifiers
Ching-Yun Ko
Pin-Yu Chen
Payel Das
Youssef Mroueh
Soham Dan
Georgios Kollias
Subhajit Chaudhury
Tejaswini Pedapati
Luca Daniel
173
5
0
04 Oct 2024
Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-Augmented Generation
Tobias Leemann
Periklis Petridis
G. Vietri
Dionysis Manousakas
Aaron Roth
Sergul Aydore
444
0
0
04 Oct 2024
Dynamic Gradient Alignment for Online Data Mixing
Simin Fan
David Grangier
Pierre Ablin
155
8
0
03 Oct 2024
Comparing Criteria Development Across Domain Experts, Lay Users, and Models in Large Language Model Evaluation
Annalisa Szymanski
Simret Araya Gebreegziabher
Oghenemaro Anuyah
Ronald A Metoyer
Tao Li
ALM
ELM
200
14
0
02 Oct 2024
SciSafeEval: A Comprehensive Benchmark for Safety Alignment of Large Language Models in Scientific Tasks
Tianhao Li
Jingyu Lu
Chuangxin Chu
Tianyu Zeng
Yujia Zheng
...
Xuejing Yuan
Xingkai Wang
Keyan Ding
Huajun Chen
Qiang Zhang
ELM
268
20
0
02 Oct 2024
Mixing It Up: The Cocktail Effect of Multi-Task Fine-Tuning on LLM Performance -- A Case Study in Finance
Meni Brief
Oded Ovadia
Gil Shenderovitz
Noga Ben Yoash
Rachel Lemberg
Eitam Sheetrit
253
12
0
01 Oct 2024
AlignSum: Data Pyramid Hierarchical Fine-tuning for Aligning with Human Summarization Preference
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
Yang Han
Yiming Wang
Rui Wang
Lu Chen
Kai Yu
AI4TS
ALM
150
5
0
01 Oct 2024
Evaluating the fairness of task-adaptive pretraining on unlabeled test data before few-shot text classification
Kush Dubey
219
3
0
30 Sep 2024
Classification of Radiological Text in Small and Imbalanced Datasets in a Non-English Language
Vincent Beliveau
Helene Kaas
Martin Prener
Claes N. Ladefoged
Desmond Elliott
Gitte M. Knudsen
Lars H. Pinborg
Melanie Ganz
131
2
0
30 Sep 2024
The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging
Masanori Hirano
Kentaro Imajo
MoMe
156
3
0
30 Sep 2024
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
International Conference on Learning Representations (ICLR), 2024
David Grangier
Simin Fan
Skyler Seto
Pierre Ablin
456
12
0
30 Sep 2024
Do We Need Domain-Specific Embedding Models? An Empirical Investigation
Yixuan Tang
Yi Yang
AIFin
568
13
0
27 Sep 2024