Cross-head mutual Mean-Teaching for semi-supervised medical image segmentation
Semi-supervised medical image segmentation (SSMIS) has advanced substantially by leveraging limited labeled data together with abundant unlabeled data. Nevertheless, existing state-of-the-art methods struggle to predict accurate labels for the unlabeled data, which introduces disruptive noise during training and makes the models prone to overfitting erroneous information. Additionally, applying perturbations to inaccurate predictions further degrades consistency learning. To address these issues, a novel \textbf{C}ross-head \textbf{m}utual \textbf{m}ean-\textbf{t}eaching Network (CMMT-Net) is proposed. CMMT-Net comprises teacher-student networks and incorporates strong-weak data augmentation within a shared encoder, facilitating cross-head co-training that capitalizes on both self-training and consistency learning. Consistency learning is enhanced by averaging teacher networks and by mutual virtual adversarial training, leading to deterministic and higher-quality predictions. Cross-Set CutMix increases the diversity of consistency training samples and helps mitigate distribution mismatch. Notably, CMMT-Net simultaneously applies data-level, feature-level, and network-level perturbations, boosting model diversity and generalization performance. The proposed method consistently outperforms existing SSMIS methods on three publicly available datasets across various semi-supervised settings. Code and logs will be available at \url{https://github.com/Leesoon1984/CMMT-Net}.
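
Two of the building blocks named above, mean teaching and CutMix across sets, can be illustrated with a minimal sketch. This is not the CMMT-Net implementation: the abstract gives no details, so the function names (`ema_update`, `cutmix`, `random_box`), the decay value, the dictionary-of-floats "weights", and the reading of Cross-Set CutMix as pasting a rectangle from a labeled image into an unlabeled one are all assumptions made for illustration.

```python
import random


def ema_update(teacher, student, decay=0.99):
    """Mean-teacher step: move each teacher weight toward the student weight.

    `teacher` and `student` are plain dicts of floats standing in for network
    parameters (an assumption; a real model would hold tensors).
    """
    return {k: decay * teacher[k] + (1.0 - decay) * student[k] for k in teacher}


def random_box(height, width, rng):
    """Pick a random rectangle (top, left, h, w) that fits inside the image."""
    h = rng.randint(1, height)
    w = rng.randint(1, width)
    top = rng.randint(0, height - h)
    left = rng.randint(0, width - w)
    return top, left, h, w


def cutmix(img_a, img_b, box):
    """Paste the rectangle `box` of img_b onto a copy of img_a.

    Images are lists of rows. Using an unlabeled image as img_a and a labeled
    image as img_b is one assumed reading of "cross-set" mixing.
    """
    top, left, h, w = box
    out = [row[:] for row in img_a]  # copy so the input is not modified
    for r in range(top, top + h):
        for c in range(left, left + w):
            out[r][c] = img_b[r][c]
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    unlabeled = [[0] * 4 for _ in range(4)]
    labeled = [[1] * 4 for _ in range(4)]
    mixed = cutmix(unlabeled, labeled, random_box(4, 4, rng))
    teacher = ema_update({"w": 1.0}, {"w": 0.0}, decay=0.9)
    print(mixed, teacher)
```

In a segmentation setting the same box would typically also be applied to the label or pseudo-label maps so that the mixed target stays aligned with the mixed image.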