
DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning

Abstract

Multimodal learning integrates complementary information from diverse modalities to enhance the decision-making process. However, the potential of multimodal collaboration remains under-exploited due to disparities in data quality and modality representation capabilities. To address this, we introduce DynCIM, a novel dynamic curriculum learning framework designed to quantify the inherent imbalances from both sample and modality perspectives. DynCIM employs a sample-level curriculum to dynamically assess each sample's difficulty according to prediction deviation, consistency, and stability, while a modality-level curriculum measures modality contributions from both global and local perspectives. Furthermore, a gating-based dynamic fusion mechanism is introduced to adaptively adjust modality contributions, minimizing redundancy and optimizing fusion effectiveness. Extensive experiments on six multimodal benchmark datasets, spanning both bimodal and trimodal scenarios, demonstrate that DynCIM consistently outperforms state-of-the-art methods. Our approach effectively mitigates modality and sample imbalances while enhancing adaptability and robustness in multimodal learning tasks. Our code is available at this https URL.
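As a rough illustration of what a gating-based fusion step can look like, the sketch below weights per-modality feature vectors by softmax-normalized gate scores and sums them. This is a generic gated fusion, not DynCIM's actual formulation, which the abstract does not specify; the names `softmax`, `gated_fusion`, `features`, and `gate_scores` are hypothetical, and in a trained model the scores would presumably come from a learned gating network rather than being passed in by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_fusion(features, gate_scores):
    """Fuse per-modality feature vectors using softmax gate weights.

    features: list of equal-length feature vectors, one per modality.
    gate_scores: one unnormalized score per modality (an assumption of
    this sketch; a real model would compute these from the inputs).
    Returns the fused vector and the normalized gate weights.
    """
    if len(features) != len(gate_scores):
        raise ValueError("need exactly one gate score per modality")
    weights = softmax(gate_scores)
    dim = len(features[0])
    fused = [0.0] * dim
    for w, feat in zip(weights, features):
        if len(feat) != dim:
            raise ValueError("all modality features must share a dimension")
        for i in range(dim):
            fused[i] += w * feat[i]
    return fused, weights


# Example: three modalities (e.g. audio, visual, text) with 2-d features.
fused, weights = gated_fusion(
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [2.0, 0.5, 0.5],
)
```

A higher gate score shifts more of the fused representation toward that modality, which is the general sense in which a gate can adaptively adjust modality contributions.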

@article{qian2025_2503.06456,
  title={DynCIM: Dynamic Curriculum for Imbalanced Multimodal Learning},
  author={Chengxuan Qian and Kai Han and Jingchao Wang and Zhenlong Yuan and Chongwen Lyu and Jun Chen and Zhe Liu},
  journal={arXiv preprint arXiv:2503.06456},
  year={2025}
}