Med-EASi: Finely Annotated Dataset and Models for Controllable Simplification of Medical Texts

AAAI Conference on Artificial Intelligence (AAAI), 2023
Abstract

Automatic medical text simplification can assist providers with patient-friendly communication and make medical texts more accessible, thereby improving health literacy. But curating a quality corpus for this task requires the supervision of medical experts. In this work, we present Med-EASi (Medical dataset for Elaborative and Abstractive Simplification), a uniquely crowdsourced and finely annotated dataset for supervised simplification of short medical texts. Its expert-layman-AI collaborative annotations facilitate controllability over text simplification by marking four kinds of textual transformations: elaboration, replacement, deletion, and insertion. To learn medical text simplification, we fine-tune T5-large with four different styles of input-output combinations, leading to two control-free and two controllable versions of the model. We add two types of controllability into text simplification, by using a multi-angle training approach: position-aware, which uses in-place annotated inputs and outputs, and position-agnostic, where the model only knows the contents to be edited, but not their positions. Our results show that our fine-grained annotations improve learning compared to the unannotated baseline. Furthermore, position-aware control generates better simplification than the position-agnostic one. The data and code are available at https://github.com/Chandrayee/CTRL-SIMP.
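
As an illustration of the difference between the two controllable input styles, the following minimal sketch builds a position-aware and a position-agnostic model input from one annotated sentence. The tag names (`<elab>`, `<rep>`, `<del>`, `<ins>`), the `simplify:` prefix, the `| edits:` separator, and both function names are assumptions made up for this example; they are not taken from the dataset or the released code, whose actual markup may differ.

```python
import re

# Hypothetical inline tags for the four edit types named in the abstract.
# Tag names and prompt layout are assumptions, not the dataset's real markup.
TAG_PATTERN = re.compile(r"<(elab|rep|del|ins)>(.*?)</\1>")


def position_aware_input(annotated: str) -> str:
    """Keep the in-place tags so the model sees where each edit applies."""
    return "simplify: " + annotated


def position_agnostic_input(annotated: str) -> str:
    """Strip the tags from the text and list the edit contents separately,
    so the model knows what to edit but not where."""
    spans = [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(annotated)]
    plain = TAG_PATTERN.sub(lambda m: m.group(2), annotated)
    controls = " ; ".join(f"{kind}: {text}" for kind, text in spans)
    return f"simplify: {plain} | edits: {controls}"


example = "The patient has <rep>hypertension</rep> and <del>idiopathic</del> fatigue."
aware = position_aware_input(example)
agnostic = position_agnostic_input(example)
print(aware)
print(agnostic)
```

In this sketch the position-aware string still carries the tags around the spans to be changed, while the position-agnostic string contains the untagged sentence followed by a flat list of edit types and their contents.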
