Facial Affect Recognition based on Multi Architecture Encoder and
Feature Fusion for the ABAW7 Challenge
- CVBM
Main: 7 pages
1 figure
Bibliography: 3 pages
1 table
Abstract
In this paper, we present our approach to the 7th ABAW competition. The competition comprises three sub-challenges: Valence-Arousal (VA) estimation, Expression (Expr) classification, and Action Unit (AU) detection. We use state-of-the-art models to extract visual features, and a Transformer Encoder then integrates these features for each of the VA, Expr, and AU sub-challenges. Because the extracted features differ in dimension, we introduce an affine module that aligns them to a common dimension. Overall, our results significantly outperform the baselines.
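The pipeline described in the abstract (several encoders, an affine alignment step, and a Transformer Encoder for fusion) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and every specific in it is an assumption made for illustration: the function names (`affine_align`, `encoder_layer`, `fuse`), the feature sizes 768/512/1024, the single-head post-norm layer, the random untrained weights, and the choice to fuse by concatenating the aligned features along the token axis. The abstract does not state how the aligned features are combined.

```python
import numpy as np

def affine_align(feats, d_model, rng):
    """Map each (T, d_i) feature matrix to (T, d_model) with its own affine map.

    Weights are random and untrained; in a real model they would be learned.
    """
    aligned = []
    for x in feats:
        d_in = x.shape[1]
        W = rng.normal(0.0, d_in ** -0.5, size=(d_in, d_model))
        b = np.zeros(d_model)
        aligned.append(x @ W + b)
    return aligned

def layer_norm(x, eps=1e-5):
    # Normalize each token over its feature axis (no learned gain or bias).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def init_params(d, d_ff, rng):
    shapes = {"Wq": (d, d), "Wk": (d, d), "Wv": (d, d), "Wo": (d, d),
              "W1": (d, d_ff), "W2": (d_ff, d)}
    return {name: rng.normal(0.0, shape[0] ** -0.5, size=shape)
            for name, shape in shapes.items()}

def encoder_layer(x, p):
    """One single-head, post-norm Transformer encoder layer on x of shape (N, d)."""
    d = x.shape[1]
    q, k, v = x @ p["Wq"], x @ p["Wk"], x @ p["Wv"]
    attn = softmax(q @ k.T / np.sqrt(d))       # (N, N) attention weights
    x = layer_norm(x + attn @ v @ p["Wo"])     # residual + norm
    hidden = np.maximum(0.0, x @ p["W1"])      # ReLU feed-forward
    return layer_norm(x + hidden @ p["W2"])    # residual + norm

def fuse(feats, d_model=64, d_ff=128, seed=0):
    """Align features to d_model, stack them as tokens, and run one encoder layer."""
    rng = np.random.default_rng(seed)
    aligned = affine_align(feats, d_model, rng)
    tokens = np.concatenate(aligned, axis=0)   # (K*T, d_model); assumed fusion scheme
    return encoder_layer(tokens, init_params(d_model, d_ff, rng))

if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    # Three hypothetical backbones producing per-frame features for T = 5 frames.
    feats = [data_rng.normal(size=(5, d)) for d in (768, 512, 1024)]
    print(fuse(feats).shape)
```

Without the alignment step the three feature matrices could not be stacked into one token sequence, since their last dimensions differ. A task-specific head for VA, Expr, or AU would follow the encoder output; it is omitted here because the abstract gives no details about it.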
