Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models

14 March 2025

Abstract

Time series analysis is crucial for understanding the dynamics of complex systems. Recent advances in foundation models have led to task-agnostic Time Series Foundation Models (TSFMs) and Large Language Model-based Time Series Models (TSLLMs), which enable generalized learning and the integration of contextual information. However, their success depends on large, diverse, and high-quality datasets, which are difficult to build because of regulatory, diversity, quality, and quantity constraints. Synthetic data emerge as a viable solution to these challenges, offering scalable, unbiased, and high-quality alternatives. This survey provides a comprehensive review of synthetic data for TSFMs and TSLLMs: it analyzes data generation strategies, examines their role in model pretraining, fine-tuning, and evaluation, and identifies future research directions.


@article{liu2025_2503.11411,
  title={Empowering Time Series Analysis with Synthetic Data: A Survey and Outlook in the Era of Foundation Models},
  author={Xu Liu and Taha Aksu and Juncheng Liu and Qingsong Wen and Yuxuan Liang and Caiming Xiong and Silvio Savarese and Doyen Sahoo and Junnan Li and Chenghao Liu},
  journal={arXiv preprint arXiv:2503.11411},
  year={2025}
}
