v1v2 (latest)

MuSACo: Multimodal Subject-Specific Selection and Adaptation for Expression Recognition with Co-Training

17 August 2025

Muhammad Osama Zeeshan

ArXiv (abs)PDF HTML Github

Main:15 Pages

7 Figures

Bibliography:4 Pages

22 Tables

Abstract

Personalized expression recognition (ER) involves adapting a machine learning model to subject-specific data for improved recognition of expressions with considerable interpersonal variability. Subject-specific ER can benefit significantly from multi-source domain adaptation (MSDA) methods, where each domain corresponds to a specific subject to improve model accuracy and robustness. Despite promising results, state-of-the-art MSDA approaches often overlook multimodal information or blend sources into a single domain, limiting subject diversity and failing to explicitly capture unique subject-specific characteristics. To address these limitations, we introduce MuSACo, a multimodal subject-specific selection and adaptation method for ER based on co-training. It leverages complementary information across multiple modalities and multiple source domains for subject-specific adaptation. This makes MuSACo particularly relevant for affective computing applications in digital health, such as patient-specific assessment for stress or pain, where subject-level nuances are crucial. MuSACo selects source subjects relevant to the target and generates pseudo-labels using the dominant modality for class-aware learning, in conjunction with a class-agnostic loss to learn from less confident target samples. Finally, source features from each modality are aligned, while only confident target features are combined. Experimental results on challenging multimodal ER datasets: BioVid, StressID, and BAH show that MuSACo outperforms UDA (blending) and state-of-the-art MSDA methods.

View on arXiv