
Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs

Abstract

Recent advances in multimodal ECG representation learning center on aligning ECG signals with paired free-text reports. However, suboptimal alignment persists due to the complexity of medical language and the reliance on a full 12-lead setup, which is often unavailable in under-resourced settings. To tackle these issues, we propose **K-MERL**, a knowledge-enhanced multimodal ECG representation learning framework. **K-MERL** leverages large language models to extract structured knowledge from free-text reports and employs a lead-aware ECG encoder with dynamic lead masking to accommodate arbitrary lead inputs. Evaluations on six external ECG datasets show that **K-MERL** achieves state-of-the-art performance in zero-shot classification and linear probing tasks, while delivering an average **16%** AUC improvement over existing methods in partial-lead zero-shot classification.
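The abstract names two mechanisms: LLM-based extraction of structured knowledge from free-text reports, and a lead-aware ECG encoder trained with dynamic lead masking so that any subset of the 12 leads can be encoded. The sketch below illustrates only the second idea. It is a minimal PyTorch illustration, not the authors' implementation: the class and function names, patching scheme, embedding size, and transformer depth are all assumptions made for the example.

```python
import torch
import torch.nn as nn


class LeadAwareECGEncoder(nn.Module):
    """Toy lead-aware encoder: each lead gets its own learnable identity
    embedding, and a boolean lead mask lets one model consume any subset
    of the 12 leads. Dimensions here are illustrative assumptions."""

    def __init__(self, num_leads=12, sig_len=5000, patch_len=250, dim=256):
        super().__init__()
        assert sig_len % patch_len == 0
        self.num_patches = sig_len // patch_len
        self.patch_embed = nn.Linear(patch_len, dim)    # per-lead patch tokens
        self.lead_embed = nn.Embedding(num_leads, dim)  # lead-specific embedding
        self.pos_embed = nn.Parameter(
            torch.zeros(1, num_leads * self.num_patches, dim))
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, ecg, lead_mask=None):
        # ecg: (B, num_leads, sig_len); lead_mask: (B, num_leads) bool,
        # where True marks leads that are absent and must be ignored.
        B, L, _ = ecg.shape
        x = self.patch_embed(ecg.reshape(B, L, self.num_patches, -1))  # (B, L, P, D)
        x = x + self.lead_embed.weight[None, :, None, :]               # add lead identity
        x = x.flatten(1, 2) + self.pos_embed                           # (B, L*P, D)
        pad = None
        if lead_mask is not None:
            pad = lead_mask[:, :, None].expand(-1, -1, self.num_patches).flatten(1)
        out = self.encoder(x, src_key_padding_mask=pad)
        if pad is None:
            return out.mean(dim=1)
        keep = (~pad).unsqueeze(-1).float()              # mask-aware mean pooling
        return (out * keep).sum(dim=1) / keep.sum(dim=1).clamp(min=1)


def random_lead_mask(batch_size, num_leads=12, min_keep=1):
    """Dynamic lead masking for training: per sample, hide a random
    subset of leads while keeping at least `min_keep` visible."""
    mask = torch.ones(batch_size, num_leads, dtype=torch.bool)
    for i in range(batch_size):
        k = torch.randint(min_keep, num_leads + 1, (1,)).item()
        mask[i, torch.randperm(num_leads)[:k]] = False   # False = visible lead
    return mask


encoder = LeadAwareECGEncoder()
ecg = torch.randn(4, 12, 5000)                           # batch of 12-lead signals
emb = encoder(ecg, lead_mask=random_lead_mask(4))        # (4, 256) embeddings
```

Because masked leads are excluded from both attention and pooling, the same encoder produces embeddings for full 12-lead recordings and for reduced-lead inputs alike, which is the setting the partial-lead zero-shot evaluation in the abstract targets.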

@article{liu2025_2502.17900,
  title={Knowledge-enhanced Multimodal ECG Representation Learning with Arbitrary-Lead Inputs},
  author={Che Liu and Cheng Ouyang and Zhongwei Wan and Haozhe Wang and Wenjia Bai and Rossella Arcucci},
  journal={arXiv preprint arXiv:2502.17900},
  year={2025}
}