56
0

Mitigating Data Scarcity in Time Series Analysis: A Foundation Model with Series-Symbol Data Generation

Abstract

Foundation models for time series analysis (TSA) have attracted significant attention. However, challenges such as data scarcity and data imbalance continue to hinder their development. To address this, we consider modeling complex systems through symbolic expressions that serve as semantic descriptors of time series. Building on this concept, we introduce a series-symbol (S2) dual-modulity data generation mechanism, enabling the unrestricted creation of high-quality time series data paired with corresponding symbolic representations. Leveraging the S2 dataset, we develop SymTime, a pre-trained foundation model for TSA. SymTime demonstrates competitive performance across five major TSA tasks when fine-tuned with downstream task, rivaling foundation models pre-trained on real-world datasets. This approach underscores the potential of dual-modality data generation and pretraining mechanisms in overcoming data scarcity and enhancing task performance.

View on arXiv
@article{wang2025_2502.15466,
  title={ Mitigating Data Scarcity in Time Series Analysis: A Foundation Model with Series-Symbol Data Generation },
  author={ Wenxuan Wang and Kai Wu and Yujian Betterest Li and Dan Wang and Xiaoyu Zhang and Jing Liu },
  journal={arXiv preprint arXiv:2502.15466},
  year={ 2025 }
}
Comments on this paper