EMRModel: A Large Language Model for Extracting Medical Consultation Dialogues into Structured Medical Records

Medical consultation dialogues contain critical clinical information, yet their unstructured nature hinders effective utilization in diagnosis and treatment. Traditional methods, relying on rule-based or shallow machine learning techniques, struggle to capture deep and implicit semantics. Recently, large pre-trained language models and Low-Rank Adaptation (LoRA), a lightweight fine-tuning method, have shown promise for structured information extraction. We propose EMRModel, a novel approach that integrates LoRA-based fine-tuning with code-style prompt design to efficiently convert medical consultation dialogues into structured electronic medical records (EMRs). Additionally, we construct a high-quality, realistically grounded dataset of medical consultation dialogues with detailed annotations. Furthermore, we introduce a fine-grained evaluation benchmark for medical consultation information extraction and provide a systematic evaluation methodology, advancing the optimization of medical natural language processing (NLP) models. Experimental results show that EMRModel achieves an F1 score of 88.1%, improving by 49.5% over standard pre-trained models. Compared to traditional LoRA fine-tuning methods, our model shows superior performance, highlighting its effectiveness in structured medical record extraction tasks.
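
The abstract reports an F1 score but does not say how it is computed. As an illustration only, here is a minimal sketch of one common choice for structured extraction: exact-match F1 over (field, value) pairs of a predicted record against a gold record. The function name `field_f1` and the field names (`chief_complaint`, `past_history`, `diagnosis`) are hypothetical and are not taken from the paper; the authors' benchmark may score fields differently.

```python
from typing import Dict, Tuple


def field_f1(predicted: Dict[str, str], gold: Dict[str, str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exact-match (field, value) pairs.

    Assumption: a field counts as correct only if its stripped value matches
    the gold value exactly. Empty values are ignored on both sides.
    """
    pred_pairs = {(k, v.strip()) for k, v in predicted.items() if v.strip()}
    gold_pairs = {(k, v.strip()) for k, v in gold.items() if v.strip()}

    true_positives = len(pred_pairs & gold_pairs)
    precision = true_positives / len(pred_pairs) if pred_pairs else 0.0
    recall = true_positives / len(gold_pairs) if gold_pairs else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical records: two of three fields match exactly.
gold_record = {
    "chief_complaint": "cough for 3 days",
    "past_history": "none",
    "diagnosis": "acute bronchitis",
}
predicted_record = {
    "chief_complaint": "cough for 3 days",
    "past_history": "hypertension",
    "diagnosis": "acute bronchitis",
}

precision, recall, f1 = field_f1(predicted_record, gold_record)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Exact matching is strict for free-text fields such as symptom descriptions; a token-overlap or semantic-similarity criterion could be substituted inside the same precision/recall structure.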
@article{zhao2025_2504.16448,
  title={EMRModel: A Large Language Model for Extracting Medical Consultation Dialogues into Structured Medical Records},
  author={Shuguang Zhao and Qiangzhong Feng and Zhiyang He and Peipei Sun and Yingying Wang and Xiaodong Tao and Xiaoliang Lu and Mei Cheng and Xinyue Wu and Yanyan Wang and Wei Liang},
  journal={arXiv preprint arXiv:2504.16448},
  year={2025}
}