112
0

Time to Embed: Unlocking Foundation Models for Time Series with Channel Descriptions

Main:8 Pages
28 Figures
Bibliography:5 Pages
16 Tables
Appendix:22 Pages
Abstract

Traditional time series models are task-specific and often depend on dataset-specific training and extensive feature engineering. While Transformer-based architectures have improved scalability, foundation models, commonplace in text, vision, and audio, remain under-explored for time series and are largely restricted to forecasting. We introduce CHARM\textbf{CHARM}, a foundation embedding model for multivariate time series that learns shared, transferable, and domain-aware representations. To address the unique difficulties of time series foundation learning, CHARM\textbf{CHARM} incorporates architectural innovations that integrate channel-level textual descriptions while remaining invariant to channel order. The model is trained using a Joint Embedding Predictive Architecture (JEPA), with novel augmentation schemes and a loss function designed to improve interpretability and training stability. Our 77M-parameter model achieves state-of-the-art performance across diverse downstream tasks, setting a new benchmark for time series representation learning.

View on arXiv
Comments on this paper