ELMTEX: Fine-Tuning Large Language Models for Structured Clinical Information Extraction. A Case Study on Clinical Reports

8 February 2025
Aynur Guluzade
Naguib Heiba
Zeyd Boukhers
Florim Hamiti
Jahid Hasan Polash
Yehya Mohamad
Carlos A Velasco
Abstract

Europe's healthcare systems require enhanced interoperability and digitalization, driving a demand for innovative solutions to process legacy clinical data. This paper presents the results of our project, which aims to leverage Large Language Models (LLMs) to extract structured information from unstructured clinical reports, focusing on patient history, diagnoses, treatments, and other predefined categories. We developed a workflow with a user interface and evaluated LLMs of varying sizes through prompting strategies and fine-tuning. Our results show that fine-tuned smaller models match or surpass larger counterparts in performance, offering efficiency for resource-limited settings. A new dataset of 60,000 annotated English clinical summaries and 24,000 German translations was validated with automated and manual checks. The evaluations used ROUGE, BERTScore, and entity-level metrics. The work highlights the approach's viability and outlines future improvements.
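
The abstract names entity-level metrics but does not define them here. As a minimal sketch, assuming each extraction is a mapping from a category (for example diagnoses or treatments) to a list of values, and assuming exact matching after lowercasing and whitespace trimming, entity-level precision, recall, and F1 could be computed as follows. The function names `to_pairs` and `entity_prf` and the sample records are hypothetical illustrations, not the authors' implementation or data.

```python
from typing import Dict, List, Set, Tuple


def to_pairs(record: Dict[str, List[str]]) -> Set[Tuple[str, str]]:
    """Flatten a {category: [values]} record into normalized (category, value) pairs."""
    return {
        (category, value.strip().lower())
        for category, values in record.items()
        for value in values
        if value.strip()
    }


def entity_prf(
    predicted: Dict[str, List[str]], reference: Dict[str, List[str]]
) -> Tuple[float, float, float]:
    """Exact-match entity-level precision, recall, and F1 (assumed matching rule)."""
    pred = to_pairs(predicted)
    gold = to_pairs(reference)
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical records, invented purely for illustration.
reference = {
    "diagnoses": ["Type 2 diabetes", "Hypertension"],
    "treatments": ["Metformin"],
}
predicted = {
    "diagnoses": ["type 2 diabetes"],
    "treatments": ["Metformin", "Insulin"],
}

precision, recall, f1 = entity_prf(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Exact matching is the strictest choice; a real evaluation might instead use partial or fuzzy matching, which would change the scores.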

@article{guluzade2025_2502.05638,
  title={ELMTEX: Fine-Tuning Large Language Models for Structured Clinical Information Extraction. A Case Study on Clinical Reports},
  author={Aynur Guluzade and Naguib Heiba and Zeyd Boukhers and Florim Hamiti and Jahid Hasan Polash and Yehya Mohamad and Carlos A Velasco},
  journal={arXiv preprint arXiv:2502.05638},
  year={2025}
}