Sentinel: Multi-Patch Transformer with Temporal and Channel Attention for Time Series Forecasting

Transformer-based time series forecasting has recently gained strong interest due to the ability of transformers to model sequential data. Most state-of-the-art architectures exploit either temporal or inter-channel dependencies, but not both, which limits their effectiveness in multivariate time series forecasting, where both types of dependencies are crucial. We propose Sentinel, a fully transformer-based architecture composed of an encoder that extracts contextual information from the channel dimension and a decoder designed to capture causal relations and dependencies across the temporal dimension. Additionally, we introduce a multi-patch attention mechanism, which leverages the patching process to structure the input sequence so that it integrates naturally into the transformer architecture, replacing the multi-head splitting process. Extensive experiments on standard benchmarks demonstrate that Sentinel, because of its ability to "monitor" both the temporal and the inter-channel dimensions, achieves better or comparable performance with respect to state-of-the-art approaches.
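To make the multi-patch idea concrete, the following is a minimal sketch of one possible reading of it, not the authors' implementation. It assumes non-overlapping patches, identity query/key/value projections (a real model would learn these), and attention computed across channels independently for each patch index, so that the patch axis plays the role the head axis plays in standard multi-head attention. The function names `split_into_patches`, `attention`, and `multi_patch_channel_attention` are hypothetical and chosen only for illustration.

```python
import math


def split_into_patches(series, patch_len):
    """Split one channel (a list of floats) into non-overlapping patches."""
    if len(series) % patch_len != 0:
        raise ValueError("series length must be a multiple of patch_len")
    return [series[i:i + patch_len] for i in range(0, len(series), patch_len)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(tokens):
    """Scaled dot-product self-attention over a list of equal-length vectors.

    Assumption: Q, K, and V are the tokens themselves (identity projections).
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def multi_patch_channel_attention(x, patch_len):
    """x is a list of channels, each a list of floats of equal length.

    Returns out[p][c]: the attended patch p of channel c.
    """
    patched = [split_into_patches(ch, patch_len) for ch in x]  # [C][P][patch_len]
    num_patches = len(patched[0])
    # One attention group per patch index, where standard multi-head
    # attention would use one group per head.
    return [
        attention([patched[c][p] for c in range(len(x))])
        for p in range(num_patches)
    ]


# Toy input: 3 channels, 8 time steps, patches of length 4.
x = [
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
    [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0],
    [7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0, 0.0],
]
out = multi_patch_channel_attention(x, patch_len=4)
print(len(out), len(out[0]), len(out[0][0]))  # 2 3 4
```

In this sketch the output has one group per patch, each holding one vector per channel. Under the same assumptions, a temporal counterpart would attend across patches within a channel with a causal mask.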
@article{villaboni2025_2503.17658,
  title={Sentinel: Multi-Patch Transformer with Temporal and Channel Attention for Time Series Forecasting},
  author={Davide Villaboni and Alberto Castellini and Ivan Luciano Danesi and Alessandro Farinelli},
  journal={arXiv preprint arXiv:2503.17658},
  year={2025}
}