Improving Clinical NLP Performance through Language Model-Generated Synthetic Clinical Data
Shan Chen
Jack Gallifant
Marco Guevara
Yanjun Gao
Majid Afshar
Timothy A. Miller
Dmitriy Dligach
Danielle S. Bitterman

Abstract
Generative models have shown potential for producing data at scale. This study explores whether synthetic data generated by advanced language models can improve clinical natural language processing performance. The promising results suggest that such applications are feasible even in this high-stakes domain.