WiFi fingerprint-based localization has been studied intensively. Point-based solutions rely on position annotations of WiFi fingerprints. Trajectory-based solutions, however, require end-position annotations of WiFi trajectories, where a WiFi trajectory is a multivariate time series of signal features. A trajectory dataset is much larger than a pointwise dataset because the number of potential trajectories in a field may grow exponentially with the size of the field. This work presents a semi-self representation learning solution, in which a large dataset of crowdsourced unlabeled WiFi trajectories is automatically labeled using a much smaller dataset of labeled WiFi trajectories. The size of the labeled dataset only needs to be proportional to the size of the physical field, while the unlabeled dataset can be much larger. This is made possible by a novel ``cut-and-flip'' augmentation scheme based on the meet-in-the-middle paradigm. A two-stage learning procedure, consisting of trajectory embedding followed by endpoint embedding, is proposed for the unlabeled dataset. The learned representations are then labeled using the labeled dataset and connected to a neural-based localization network. The result, while delivering promising accuracy, significantly relieves the burden of human annotation for trajectory-based localization.
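The abstract names the ``cut-and-flip'' augmentation without giving its details, so the following is only a minimal sketch of one plausible reading: an unlabeled trajectory is cut at an interior time step and the second piece is reversed in time, so that both pieces end at the same (unknown) position and "meet in the middle". The function name `cut_and_flip`, the `cut` parameter, and the list-of-signal-vectors representation are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple

# Assumed representation: one signal-feature vector (e.g. RSSI per access point)
# per time step. This layout is hypothetical, chosen only for the sketch.
Trajectory = List[Sequence[float]]


def cut_and_flip(traj: Trajectory, cut: int) -> Tuple[Trajectory, Trajectory]:
    """Split `traj` at index `cut` and reverse the second piece.

    Both returned sub-trajectories end at the sample recorded at the cut
    point, so they can serve as a pair sharing the same end position
    without any position label.
    """
    # Require an interior cut so that each piece has at least two samples.
    if not 0 < cut < len(traj) - 1:
        raise ValueError("cut must be an interior index of the trajectory")
    head = list(traj[: cut + 1])   # start ... cut point
    tail = list(traj[cut:])[::-1]  # end ... cut point (time-reversed)
    return head, tail


# Toy trajectory: five time steps, two access points (values are made up).
example = [[-40, -70], [-45, -65], [-50, -60], [-55, -55], [-60, -50]]
head, tail = cut_and_flip(example, cut=2)
```

Under this reading, each unlabeled trajectory with n samples yields up to n - 2 such pairs, which is one way a large unlabeled dataset could supply training signal for endpoint embeddings without annotations.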
@article{kuo2025_2504.03756,
  title={Semi-Self Representation Learning for Crowdsourced WiFi Trajectories},
  author={Yu-Lin Kuo and Yu-Chee Tseng and Ting-Hui Chiang and Yan-Ann Chen},
  journal={arXiv preprint arXiv:2504.03756},
  year={2025}
}