STaRFormer: Semi-Supervised Task-Informed Representation Learning via Dynamic Attention-Based Regional Masking for Sequential Data

14 April 2025

Maximilian Forstenhäusler

Daniel Külzer

Christos Anagnostopoulos

Shameem Puthiya Parambath

Natascha Weber

AI4TS

MedIm

Main: 10 Pages

16 Figures

Bibliography: 10 Pages

34 Tables

Appendix: 42 Pages

Abstract

Accurate predictions from sequential spatiotemporal data are crucial for various applications. Using real-world data, we aim to learn the intent of a smart device user within confined areas of a vehicle's surroundings. However, in real-world scenarios, environmental factors and sensor limitations result in non-stationary and irregularly sampled data, which poses significant challenges. To address these issues, we developed a Transformer-based approach, STaRFormer, which serves as a universal framework for sequential modeling. STaRFormer employs a novel, dynamic attention-based regional masking scheme combined with semi-supervised contrastive learning to enhance task-specific latent representations. Comprehensive experiments on 15 datasets that vary in type (including non-stationary and irregularly sampled), domain, sequence length, number of training samples, and application demonstrate the efficacy and practicality of STaRFormer. We achieve notable improvements over state-of-the-art approaches. Code and data will be made available.
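
The abstract names attention-based regional masking without giving its details, so the following is only a rough illustrative sketch of the general idea and not the authors' implementation. It assumes that one attention score per time step is already available, that the regions are fixed-width windows centered on the highest-scoring steps, and that "dynamic" means the mask is recomputed whenever the scores change. The names `regional_mask`, `apply_mask`, `num_regions`, and `half_width` are hypothetical.

```python
from typing import List, Sequence


def regional_mask(attn: Sequence[float], num_regions: int, half_width: int) -> List[bool]:
    """Return a boolean mask (True = masked) covering windows around the
    `num_regions` positions with the highest attention scores.

    Assumption: regions are symmetric windows of width 2 * half_width + 1,
    clipped at the sequence boundaries.
    """
    n = len(attn)
    # Indices sorted by attention score, highest first (ties keep the earlier index).
    ranked = sorted(range(n), key=lambda i: -attn[i])
    mask = [False] * n
    for center in ranked[:num_regions]:
        lo = max(0, center - half_width)
        hi = min(n, center + half_width + 1)
        for i in range(lo, hi):
            mask[i] = True
    return mask


def apply_mask(seq: Sequence[float], mask: Sequence[bool], fill: float = 0.0) -> List[float]:
    """Replace masked elements with `fill` and leave the rest unchanged."""
    return [fill if m else x for x, m in zip(seq, mask)]


# Toy example: hypothetical per-step attention scores for a length-8 sequence.
attn = [0.05, 0.10, 0.40, 0.05, 0.05, 0.05, 0.25, 0.05]
seq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

mask = regional_mask(attn, num_regions=2, half_width=1)
masked_seq = apply_mask(seq, mask)
print(mask)
print(masked_seq)
```

In a training loop, the masked sequence could serve as a second view of the input for a contrastive objective, with the mask recomputed from the model's current attention at each step. How STaRFormer actually selects regions and forms contrastive pairs is described in the paper itself.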
