A Systematic Analysis of Large Language Models with RAG-enabled Dynamic Prompting for Medical Error Detection and Correction

Comments: Main text 22 pages, appendix 6 pages; 2 figures; 17 tables
Abstract
Objective: Clinical documentation contains factual, diagnostic, and management errors that can compromise patient safety. Large language models (LLMs) may help detect and correct such errors, but their behavior under different prompting strategies remains unclear. We evaluate zero-shot prompting, static prompting with random exemplars (SPR), and retrieval-augmented dynamic prompting (RDP) for three subtasks of medical error processing: error flag detection, error sentence detection, and error correction.
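To make the RDP setting concrete, the sketch below shows one plausible way dynamic prompting can work: retrieve the exemplars most similar to the input note and place them in the prompt ahead of the query. The retriever (TF-IDF here), the `EXEMPLARS` pool, and the prompt wording are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of retrieval-augmented dynamic prompting (RDP).
# Assumptions: a toy exemplar pool and a TF-IDF retriever; the paper's
# real retriever, exemplar set, and prompt template may differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical exemplar pool: (clinical text, error flag, error sentence, correction)
EXEMPLARS = [
    ("Patient started on warfarin; INR monitored weekly.", 0, "NA", "NA"),
    ("Patient with pneumonia treated with amlodipine.", 1,
     "Patient with pneumonia treated with amlodipine.",
     "Patient with pneumonia treated with azithromycin."),
]

def build_rdp_prompt(note: str, k: int = 2) -> str:
    """Retrieve the k exemplars most similar to `note` and format a prompt."""
    corpus = [ex[0] for ex in EXEMPLARS]
    vectorizer = TfidfVectorizer().fit(corpus + [note])
    sims = cosine_similarity(vectorizer.transform([note]),
                             vectorizer.transform(corpus))[0]
    top = sims.argsort()[::-1][:k]  # indices of the most similar exemplars

    lines = ["Detect and correct any medical error in the final note.\n"]
    for i in top:
        text, flag, sent, corr = EXEMPLARS[i]
        lines.append(f"Note: {text}\nError flag: {flag}\n"
                     f"Error sentence: {sent}\nCorrection: {corr}\n")
    lines.append(f"Note: {note}\nError flag:")
    return "\n".join(lines)

print(build_rdp_prompt("Patient with pneumonia prescribed lisinopril for the infection."))
```

In this sketch, zero-shot prompting corresponds to omitting the retrieved exemplars, and SPR corresponds to sampling them at random rather than by similarity.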
