arXiv:2507.05517 (v3, latest)

Empowering Healthcare Practitioners with Language Models: Structuring Speech Transcripts in Two Real-World Clinical Applications

7 July 2025
Jean-Philippe Corbeil
Asma Ben Abacha
George Michalopoulos
Phillip Swazinna
Miguel Del-Agua
Jerome Tremblay
Akila Jeeson Daniel
Cari Bader
Yu-Cheng Cho
Pooja Krishnan
Nathan Bodenstab
Thomas Lin
Wenxuan Teng
François Beaulieu
Paul Vozila
    LM&MA
Main: 6 pages, bibliography: 3 pages, appendix: 3 pages; 3 figures, 5 tables
Abstract

Large language models (LLMs) such as GPT-4o and o1 have demonstrated strong performance on clinical natural language processing (NLP) tasks across multiple medical benchmarks. Nonetheless, two high-impact NLP tasks - structured tabular reporting from nurse dictations and medical order extraction from doctor-patient consultations - remain underexplored due to data scarcity and sensitivity, despite active industry efforts. Practical solutions to these real-world clinical tasks can significantly reduce the documentation burden on healthcare providers, allowing greater focus on patient care. In this paper, we investigate these two challenging tasks using private and open-source clinical datasets, evaluating the performance of both open- and closed-weight LLMs, and analyzing their respective strengths and limitations. Furthermore, we propose an agentic pipeline for generating realistic, non-sensitive nurse dictations, enabling structured extraction of clinical observations. To support further research in both areas, we release SYNUR and SIMORD, the first open-source datasets for nurse observation extraction and medical order extraction.
