2004.10964
Cited By
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
23 April 2020
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
Papers citing
"Don't Stop Pretraining: Adapt Language Models to Domains and Tasks"
50 / 332 papers shown
Title
The Utility of Large Language Models and Generative AI for Education Research
Andrew Katz
Umair Shakir
B. Chambers
AI4CE
18
6
0
29 May 2023
HowkGPT: Investigating the Detection of ChatGPT-generated University Student Homework through Context-Aware Perplexity Analysis
Christoforos Vasilatos
Manaar Alam
Talal Rahwan
Yasir Zaki
Michail Maniatakos
DeLMO
32
32
0
26 May 2023
Difference-Masking: Choosing What to Mask in Continued Pretraining
Alex Wilf
Syeda Nahida Akter
Leena Mathur
Paul Pu Liang
Sheryl Mathew
Mengrou Shou
Eric Nyberg
Louis-Philippe Morency
CLL
SSL
24
4
0
23 May 2023
APPLS: Evaluating Evaluation Metrics for Plain Language Summarization
Yue Guo
Tal August
Gondy Leroy
T. Cohen
Lucy Lu Wang
55
9
0
23 May 2023
Rethinking Semi-supervised Learning with Language Models
Zhengxiang Shi
Francesco Tonolini
Nikolaos Aletras
Emine Yilmaz
G. Kazai
Yunlong Jiao
27
17
0
22 May 2023
Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Xiao Wang
Wei Zhou
Qi Zhang
Jie Zhou
Songyang Gao
Junzhe Wang
Menghan Zhang
Xiang Gao
Yunwen Chen
Tao Gui
34
7
0
22 May 2023
Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting
Xinlu Zhang
Shiyang Li
Xianjun Yang
Chenxin Tian
Yao Qin
Linda R. Petzold
19
9
0
22 May 2023
Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis
Seraphina Goldfarb-Tarrant
Bjorn Ross
Adam Lopez
27
7
0
22 May 2023
Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification
Renliang Sun
Wei-ping Xu
Xiaojun Wan
CLL
14
16
0
21 May 2023
ESCOXLM-R: Multilingual Taxonomy-driven Pre-training for the Job Market Domain
Mike Zhang
Rob van der Goot
Barbara Plank
11
14
0
20 May 2023
Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews
Hye Sun Yun
Iain J. Marshall
T. Trikalinos
Byron C. Wallace
19
16
0
19 May 2023
"Nothing Abnormal": Disambiguating Medical Reports via Contrastive Knowledge Infusion
Zexue He
An Yan
Amilcare Gentili
Julian McAuley
Chun-Nan Hsu
MedIm
14
2
0
15 May 2023
CroSentiNews 2.0: A Sentence-Level News Sentiment Corpus
Gaurish Thakkar
Nives Mikelic Preradović
Marko Tadić
6
1
0
14 May 2023
How to Train Your CheXDragon: Training Chest X-Ray Models for Transfer to Novel Tasks and Healthcare Systems
Cara Van Uden
Jeremy Irvin
Mars Huang
N. Dean
J. Carr
A. Ng
C. Langlotz
OOD
24
1
0
13 May 2023
When and What to Ask Through World States and Text Instructions: IGLU NLP Challenge Solution
Zhengxiang Shi
Jerome Ramos
To Eun Kim
Xi Wang
Hossein A. Rahmani
Aldo Lipani
21
10
0
09 May 2023
Going beyond research datasets: Novel intent discovery in the industry setting
Aleksandra Chrabrowa
Tsimur Hadeliya
D. Kajtoch
Robert Mroczkowski
Piotr Rybak
8
2
0
09 May 2023
OPI at SemEval 2023 Task 9: A Simple But Effective Approach to Multilingual Tweet Intimacy Analysis
Slawomir Dadas
11
2
0
14 Apr 2023
Attention at SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS)
Debashish Roy
Manish Shrivastava
23
1
0
10 Apr 2023
Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models
Emilio Ferrara
SILM
17
247
0
07 Apr 2023
BloombergGPT: A Large Language Model for Finance
Shijie Wu
Ozan Irsoy
Steven Lu
Vadim Dabravolski
Mark Dredze
Sebastian Gehrmann
P. Kambadur
David S. Rosenberg
Gideon Mann
AIFin
51
780
0
30 Mar 2023
End-to-End n-ary Relation Extraction for Combination Drug Therapies
Yuhang Jiang
Ramakanth Kavuluru
21
7
0
29 Mar 2023
SwissBERT: The Multilingual Language Model for Switzerland
Jannis Vamvas
Johannes Graen
Rico Sennrich
25
6
0
23 Mar 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao
Siyu Li
Yixin Liu
Zhiling Yan
Yutong Dai
Philip S. Yu
Lichao Sun
24
501
0
07 Mar 2023
DiTTO: A Feature Representation Imitation Approach for Improving Cross-Lingual Transfer
Shanu Kumar
Abbaraju Soujanya
Sandipan Dandapat
Sunayana Sitaram
Monojit Choudhury
VLM
25
1
0
04 Mar 2023
CLICKER: Attention-Based Cross-Lingual Commonsense Knowledge Transfer
Ruolin Su
Zhongkai Sun
Sixing Lu
Chengyuan Ma
Chenlei Guo
LRM
21
0
0
26 Feb 2023
kNN-Adapter: Efficient Domain Adaptation for Black-Box Language Models
Yangsibo Huang
Daogao Liu
Zexuan Zhong
Weijia Shi
Y. Lee
RALM
ALM
8
14
0
21 Feb 2023
Auditing large language models: a three-layered approach
Jakob Mokander
Jonas Schuett
Hannah Rose Kirk
Luciano Floridi
AILaw
MLAU
34
194
0
16 Feb 2023
AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models
Alexandra Chronopoulou
Matthew E. Peters
Alexander M. Fraser
Jesse Dodge
MoMe
21
65
0
14 Feb 2023
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code
Shuyan Zhou
Uri Alon
Sumit Agarwal
Graham Neubig
ELM
ALM
22
98
0
10 Feb 2023
Leveraging supplementary text data to kick-start automatic speech recognition system development with limited transcriptions
Nay San
Martijn Bartelds
Blaine Billings
Ella de Falco
Hendi Feriza
Johan Safri
Wawan Sahrozi
Ben Foley
Bradley McDonnell
Dan Jurafsky
13
9
0
09 Feb 2023
Towards Geospatial Foundation Models via Continual Pretraining
Matías Mendieta
Boran Han
Xingjian Shi
Yi Zhu
Chen Chen
VLM
AI4CE
38
63
0
09 Feb 2023
Learning Optimal Features via Partial Invariance
Moulik Choraria
Ibtihal Ferwana
Ankur Mani
L. Varshney
OOD
21
2
0
28 Jan 2023
Open Problems in Applied Deep Learning
M. Raissi
AI4CE
24
2
0
26 Jan 2023
Multi-Tenant Optimization For Few-Shot Task-Oriented FAQ Retrieval
Asha Vishwanathan
R. Warrier
G. V. Suresh
Chandrashekhar Kandpal
11
2
0
25 Jan 2023
Audience-Centric Natural Language Generation via Style Infusion
Samraj Moorjani
A. Krishnan
Hari Sundaram
E. Maslowska
Aravind Sankar
13
4
0
24 Jan 2023
A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu
Anthony Man-Cho So
Nigel Collier
23
3
0
24 Jan 2023
Adapting Multilingual Speech Representation Model for a New, Underresourced Language through Multilingual Fine-tuning and Continued Pretraining
Karol Nowakowski
M. Ptaszynski
Kyoko Murasaki
Jagna Nieuwazny
15
23
0
18 Jan 2023
Does compressing activations help model parallel training?
S. Bian
Dacheng Li
Hongyi Wang
Eric P. Xing
Shivaram Venkataraman
19
4
0
06 Jan 2023
CiT: Curation in Training for Effective Vision-Language Data
Hu Xu
Saining Xie
Po-Yao (Bernie) Huang
Licheng Yu
Russ Howes
Gargi Ghosh
Luke Zettlemoyer
Christoph Feichtenhofer
VLM
DiffM
25
24
0
05 Jan 2023
Second Thoughts are Best: Learning to Re-Align With Human Values from Text Edits
Ruibo Liu
Chenyan Jia
Ge Zhang
Ziyu Zhuang
Tony X. Liu
Soroush Vosoughi
90
34
0
01 Jan 2023
Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems: A case study for Modern Greek
Georgios Paraskevopoulos
Theodoros Kouzelis
Georgios Rouvalis
Athanasios Katsamanis
V. Katsouros
Alexandros Potamianos
VLM
18
7
0
31 Dec 2022
Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents
Sayar Ghosh Roy
Anshul Padhi
Risubh Jain
Manish Gupta
Vasudeva Varma
AI4TS
20
2
0
31 Dec 2022
Continual Contrastive Finetuning Improves Low-Resource Relation Extraction
Wenxuan Zhou
Sheng Zhang
Tristan Naumann
Muhao Chen
Hoifung Poon
43
6
0
21 Dec 2022
One Embedder, Any Task: Instruction-Finetuned Text Embeddings
Hongjin Su
Weijia Shi
Jungo Kasai
Yizhong Wang
Yushi Hu
Mari Ostendorf
Wen-tau Yih
Noah A. Smith
Luke Zettlemoyer
Tao Yu
25
278
0
19 Dec 2022
DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation
Yuxi Feng
Xiaoyuan Yi
Xiting Wang
L. Lakshmanan
Xing Xie
DiffM
27
5
0
16 Dec 2022
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks
Zhongwei Wan
Yichun Yin
Wei Zhang
Jiaxin Shi
Lifeng Shang
Guangyong Chen
Xin Jiang
Qun Liu
VLM
CLL
26
16
0
07 Dec 2022
Data-Efficient Finetuning Using Cross-Task Nearest Neighbors
Hamish Ivison
Noah A. Smith
Hannaneh Hajishirzi
Pradeep Dasigi
31
19
0
01 Dec 2022
ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format
Qi Zhu
Christian Geishauser
Hsien-Chin Lin
Carel van Niekerk
Baolin Peng
...
Dazhen Wan
Xiaochen Zhu
Jianfeng Gao
Milica Gašić
Minlie Huang
43
23
0
30 Nov 2022
Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models
Lei Wang
Jian He
Xingdong Xu
Ning Liu
Hui-juan Liu
27
2
0
27 Nov 2022
Detecting Entities in the Astrophysics Literature: A Comparison of Word-based and Span-based Entity Recognition Methods
Xiang Dai
Sarvnaz Karimi
8
3
0
24 Nov 2022