21
0

Accelerating Clinical NLP at Scale with a Hybrid Framework with Reduced GPU Demands: A Case Study in Dementia Identification

Abstract

Clinical natural language processing (NLP) is increasingly in demand in both clinical research and operational practice. However, most of the state-of-the-art solutions are transformers-based and require high computational resources, limiting their accessibility. We propose a hybrid NLP framework that integrates rule-based filtering, a Support Vector Machine (SVM) classifier, and a BERT-based model to improve efficiency while maintaining accuracy. We applied this framework in a dementia identification case study involving 4.9 million veterans with incident hypertension, analyzing 2.1 billion clinical notes. At the patient level, our method achieved a precision of 0.90, a recall of 0.84, and an F1-score of 0.87. Additionally, this NLP approach identified over three times as many dementia cases as structured data methods. All processing was completed in approximately two weeks using a single machine with dual A40 GPUs. This study demonstrates the feasibility of hybrid NLP solutions for large-scale clinical text analysis, making state-of-the-art methods more accessible to healthcare organizations with limited computational resources.

View on arXiv
@article{shi2025_2504.12494,
  title={ Accelerating Clinical NLP at Scale with a Hybrid Framework with Reduced GPU Demands: A Case Study in Dementia Identification },
  author={ Jianlin Shi and Qiwei Gan and Elizabeth Hanchrow and Annie Bowles and John Stanley and Adam P. Bress and Jordana B. Cohen and Patrick R. Alba },
  journal={arXiv preprint arXiv:2504.12494},
  year={ 2025 }
}
Comments on this paper