218
v1v2 (latest)

Frequency-Aware Masked Autoencoders for Human Activity Recognition using Accelerometers

Main:7 Pages
4 Figures
4 Tables
Abstract

Wearable accelerometers are widely used for continuous monitoring of physical activity. Supervised machine learning and deep learning algorithms have long been used to extract meaningful activity information from raw accelerometry data, but progress has been hampered by the limited amount of labeled data that is publicly available. Exploiting large unlabeled datasets using self-supervised pretraining is a relatively new and underexplored approach in the field of human activity recognition (HAR). We used a time-series transformer masked autoencoder (MAE) approach to self-supervised pretraining and propose two novel spectrogram-based loss functions: the log-scale meanmagnitude (LMM) and log-scale magnitude variance (LMV) losses. We compared these losses with the mean squared error (MSE) loss for MAE training. We leveraged the large unlabeled UK Biobank accelerometry dataset (n = 109k) for pretraining and evaluated downstream HAR performance using a linear classifier in a smaller labelled dataset. We found that pretraining with the LMM loss improved performance compared to an MAE pretrained with the MSE loss, with 12.7% increase in subject-wise F1 score when using linear probing. Compared with a state-of-the-art ResNet-based HAR model, our LMM-pretrained transformer models performed better (+9.8% F1) with linear probing and comparably when fine-tuned using an LSTM classifier. The addition of the LMV to the LMM loss decreased performance compared to the LMM loss alone. These findings establish the LMM loss as a robust and effective method for pretraining MAE models on accelerometer data for HAR and show the potential of pretraining sequence-based models for free-living HAR.

View on arXiv
Comments on this paper