8
0

RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representation from X-Ray Images

Guang Li
Ren Togo
Takahiro Ogawa
Miki Haseyama
Abstract

Purpose: Self-supervised learning has been gaining attention in the medical field for its potential to improve computer-aided diagnosis. One popular method of self-supervised learning is masked image modeling (MIM), which involves masking a subset of input pixels and predicting the masked pixels. However, traditional MIM methods typically use a random masking strategy, which may not be ideal for medical images that often have a small region of interest for disease detection. To address this issue, this work aims to improve MIM for medical images and evaluate its effectiveness in an open X-ray image dataset. Methods: In this paper, we present a novel method called region-guided masked image modeling (RGMIM) for learning meaningful representation from X-ray images. Our method adopts a new masking strategy that utilizes organ mask information to identify valid regions for learning more meaningful representations. The proposed method was contrasted with five self-supervised learning techniques (MAE, SKD, Cross, BYOL, and, SimSiam). We conduct quantitative evaluations on an open lung X-ray image dataset as well as masking ratio hyperparameter studies. Results: When using the entire training set, RGMIM outperformed other comparable methods, achieving a 0.962 lung disease detection accuracy. Specifically, RGMIM significantly improved performance in small data volumes, such as 5% and 10% of the training set (846 and 1,693 images) compared to other methods, and achieved a 0.957 detection accuracy even when only 50% of the training set was used. Conclusions: RGMIM can mask more valid regions, facilitating the learning of discriminative representations and the subsequent high-accuracy lung disease detection. RGMIM outperforms other state-of-the-art self-supervised learning methods in experiments, particularly when limited training data is used.

View on arXiv
Comments on this paper