
Semi-supervised Concept Bottleneck Models

Abstract

Concept Bottleneck Models (CBMs) have garnered increasing attention due to their ability to provide concept-based explanations for black-box deep learning models while achieving high final prediction accuracy using human-like concepts. However, training current CBMs depends heavily on the precision and richness of the annotated concepts in the dataset. These concept labels are typically provided by experts, which can be costly and require significant resources and effort. Additionally, concept saliency maps frequently misalign with input saliency maps, causing concept predictions to correspond to irrelevant input features, an issue related to annotation alignment. To address these limitations, we propose a new framework called SSCBM (Semi-supervised Concept Bottleneck Model). SSCBM suits practical settings where annotated data is scarce: by jointly training on labeled and unlabeled data and aligning the unlabeled data at the concept level, it effectively addresses both issues. We propose a strategy to generate pseudo labels together with an alignment loss. Experiments demonstrate that SSCBM is both effective and efficient. With only 10% labeled data, our model's concept and task accuracy, averaged across four datasets, is only 2.44% and 3.93% lower, respectively, than the best baseline in the fully supervised setting.
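To make the two ingredients concrete, here is a minimal sketch of what pseudo-label generation and a concept-level alignment loss could look like. This is an illustrative assumption, not the paper's actual method: the nearest-neighbor pseudo-labeling rule and the binary cross-entropy alignment term below are hypothetical stand-ins for the strategy the abstract names.

```python
import numpy as np

def pseudo_concept_labels(labeled_feats, labeled_concepts, unlabeled_feats):
    """Assign each unlabeled sample the concept vector of its nearest
    labeled neighbor in feature space (hypothetical strategy; the
    paper's actual pseudo-labeling rule may differ)."""
    # Squared Euclidean distances: (n_unlabeled, n_labeled)
    dists = ((unlabeled_feats[:, None, :] - labeled_feats[None, :, :]) ** 2).sum(-1)
    nearest = dists.argmin(axis=1)
    return labeled_concepts[nearest]

def alignment_loss(concept_logits, pseudo_labels, eps=1e-8):
    """Binary cross-entropy between predicted concept probabilities and
    pseudo concept labels, pushing unlabeled predictions to align with
    their pseudo labels at the concept level."""
    p = 1.0 / (1.0 + np.exp(-concept_logits))  # sigmoid per concept
    return -np.mean(pseudo_labels * np.log(p + eps)
                    + (1 - pseudo_labels) * np.log(1 - p + eps))
```

In a full training loop, this alignment term on unlabeled data would be added to the usual supervised concept and task losses on the labeled subset.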

@article{hu2025_2406.18992,
  title={Semi-supervised Concept Bottleneck Models},
  author={Lijie Hu and Tianhao Huang and Huanyi Xie and Xilin Gong and Chenyang Ren and Zhengyu Hu and Lu Yu and Ping Ma and Di Wang},
  journal={arXiv preprint arXiv:2406.18992},
  year={2025}
}