ViTime: A Visual Intelligence-Based Foundation Model for Time Series Forecasting

Time series forecasting (TSF) possesses great practical values in various fields, including power and energy, transportation, etc. TSF methods have been studied based on knowledge from classical statistics to modern deep learning. Yet, all of them were developed based on one fundamental concept, the numerical data fitting. Thus, the models developed have been long known for being problem-specific and lacking application generalizability. A TSF foundation model serving TSF tasks across different applications can reverse such an impression. The central question is then how to develop such a TSF foundation model. This paper offers a pioneering study in developing a TSF foundation model and proposes a vision intelligence-powered framework, ViTime, for the first time. In ViTime, a method synthesizing authentic time series periodic and trend patterns is developed to enrich sample pattern diversity. A deep architecture operating TSF in image metric space is designed to achieve significantly enhanced TSF generalizability. Extensive experiments demonstrate ViTime's SOTA performance across multiple settings. In zero-shot scenarios, ViTime outperforms TimesFM by 9-15%. With just 10% fine-tuning data, ViTime surpasses both foundation models and fully-supervised benchmarks trained on complete datasets, with this performance gap widening further at 100\% fine-tuning. Additionally, ViTime exhibits exceptional robustness, handling missing data without imputation and outperforming TimesFM by 20-30% under various data perturbations.
View on arXiv@article{yang2025_2407.07311, title={ ViTime: A Visual Intelligence-Based Foundation Model for Time Series Forecasting }, author={ Luoxiao Yang and Yun Wang and Xinqi Fan and Israel Cohen and Jingdong Chen and Yue Zhao and Zijun Zhang }, journal={arXiv preprint arXiv:2407.07311}, year={ 2025 } }